Abstract

Invariance with respect to linear or affine transformations of the domain is arguably the
most common symmetry exhibited by natural algebraic properties. In this work, we show
that any low complexity affine-invariant property of multivariate functions over finite fields
is testable with a constant number of queries. This immediately reproves, for instance, that
the Reed-Muller code over $\mathbb{F}_p$ of degree $d < p$ is testable, with an argument that uses no
detailed algebraic information about polynomials, except that low degree is preserved by
composition with affine maps.

The complexity of an affine-invariant property V refers to the maximum complexity, as 
defined by Green and Tao (Ann. Math. 2008), of the sets of linear forms used to characterize 
V. A more precise statement of our main result is that for any fixed prime $p \geq 2$ and
fixed integer $R \geq 2$, any affine-invariant property V of functions $f : \mathbb{F}_p^n \to [R]$ is testable,
assuming that the complexity of the property is less than p. Our proof involves developing
analogs of graph-theoretic techniques in an algebraic setting, using tools from higher-order 
Fourier analysis. 
1 Introduction



The field of property testing, as initiated by [BLR93, BFL91] and defined formally by [RS96, 
GGR98], is the study of algorithms that query their input a very small number of times and 
with high probability decide correctly whether their input satisfies a given property or is "far" 
from satisfying that property. A property is called testable, or sometimes strongly testable or 
locally testable, if the number of queries can be made independent of the size of the object 
without affecting the correctness probability. Perhaps surprisingly, it has been found that a 
large number of natural properties satisfy this strong requirement; see e.g. the surveys [Fis04, 
Rub06, Ron09, Sud10] for a general overview.

A fundamental problem in the area is then to find a combinatorial characterization of the 
testable properties. The characterization problem was explicitly raised even in the early work 
of [GGR98], and for dense graphs it was addressed in a long series of works culminating in 
[AFNS06] and [BCL+06]. 

In this work, we make steps towards such a characterization for the class of affine-invariant 
properties of multivariate functions over finite fields. Before stating our results, let us define
some notions that will be used throughout this paper.

1.1 Testability and Invariances 

Fix a prime $p \geq 2$ and an integer $R \geq 2$ throughout. Given a property V of functions in
$\{\mathbb{F}_p^n \to [R]\}$, we say that $f : \mathbb{F}_p^n \to [R]$ is $\epsilon$-far from V if $\min_{g \in V} \Pr_{x \in \mathbb{F}_p^n}[f(x) \neq g(x)] > \epsilon$, and
we say that it is $\epsilon$-close otherwise. V is said to be testable (with one-sided error) if there is a
function $q : (0, 1) \to \mathbb{Z}^+$ and an algorithm $T$ that, given as input a parameter $\epsilon \in (0, 1)$ and
oracle access to a function $f : \mathbb{F}_p^n \to [R]$, makes at most $q(\epsilon)$ queries to the oracle for $f$, always
accepts if $f \in V$ and rejects with probability at least 2/3 if $f$ is $\epsilon$-far from V.
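
To make the distance notion above concrete, here is a minimal brute-force sketch in Python. It assumes the property is handed over as an explicit finite list of functions on a tiny domain, which is only feasible for very small n and p; the names `distance`, `distance_to_property` and `is_far` are illustrative and do not come from the paper.

```python
from fractions import Fraction
from itertools import product


def distance(f, g, n, p):
    """Fraction of points x in F_p^n at which f and g disagree."""
    points = list(product(range(p), repeat=n))
    disagreements = sum(1 for x in points if f(x) != g(x))
    return Fraction(disagreements, len(points))


def distance_to_property(f, members, n, p):
    """Minimum distance from f to any function in the explicit list `members`."""
    return min(distance(f, g, n, p) for g in members)


def is_far(f, members, n, p, eps):
    """True if f is eps-far from the property, i.e. its distance exceeds eps."""
    return distance_to_property(f, members, n, p) > eps
```

Exact fractions are used so that the strict inequality in the definition of $\epsilon$-far is not blurred by floating-point rounding.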

As an example of a testable property, let us recall the famous result by Blum, Luby and Rubinfeld 
[BLR93] which started off this whole line of research. They showed that for testing whether a 
function $f : \mathbb{F}_p^n \to \mathbb{F}_p$ is linear or whether it is $\epsilon$-far from linear, it is enough to query the value
of $f$ at only $O(1/\epsilon)$ points of the domain.

Linearity, in addition to being testable, is also an example of a linear-invariant property. We 
say that a property $V \subseteq \{\mathbb{F}_p^n \to [R]\}$ is linear-invariant if it is the case that for any $f \in V$
and for any linear transformation $L : \mathbb{F}_p^n \to \mathbb{F}_p^n$, it holds that $f \circ L \in V$. Similarly, an
affine-invariant property is closed under composition with affine transformations $A : \mathbb{F}_p^n \to \mathbb{F}_p^n$ (an
affine transformation $A$ is of the form $L + c$ where $L$ is linear and $c$ is a constant). The property
of a function $f : \mathbb{F}_p^n \to \mathbb{F}_p$ being affine is testable by a simple reduction to [BLR93], and is itself
affine-invariant. Other well-studied examples of affine-invariant (and hence, linear-invariant)
properties include Reed-Muller codes (in other words, bounded degree polynomials) [BFL91,
BFLS91, FGL+96, RS96, AKK+05], homogeneous polynomials of bounded degree [KS08], and
subspace juntas [VX11].

In general, invariance under a large group of symmetries seems to be a common trait of math- 
ematically natural properties, and in particular, affine invariance underlies most interesting 
properties that one would classify as "algebraic". Kaufman and Sudan in [KS08] made explicit 
note of this phenomenon and urged a study of the testability of properties with focus on their in- 
variance. In their paper, Kaufman and Sudan showed that linear affine-invariant properties are 
automatically testable but left open the general question. Note that arbitrary affine-invariant 
properties are not testable; in fact, testing a random affine-invariant property requires querying
nearly all of the domain. So, the question becomes: what is the minimal set of restrictions an 
affine-invariant property must satisfy in order to be testable? In order to state the conjectured 
answer to this question, as well as our progress here, we need to introduce some more notions. 

1.2 Hereditariness and Induced Affine Constraints

We now introduce the subclass of affine-invariant properties which, we believe, captures every 
property testable with a 1-sided error test. 

Definition 1.1 (Affine subspace hereditary properties) An affine-invariant property V
is said to be affine subspace hereditary if for any $f : \mathbb{F}_p^n \to [R]$ satisfying V, the restriction
of $f$ to any affine subspace of $\mathbb{F}_p^n$ also satisfies V.

Affine subspace hereditariness thus provides something like a uniformity condition, relating the 
definition of the property for different values of n. Specializing the conjecture in [BGS10] for
linear-invariant properties to affine-invariant properties gives the following:

Conjecture 1.2 ([BGS10]) Any affine subspace hereditary property is testable with 1-sided
error. 

Moreover, [BGS10] show that every affine-invariant property testable by a "natural" tester is very
"close" to an affine subspace hereditary property¹. In fact, resolving Conjecture 1.2 would yield
a combinatorial characterization of the (natural) one-sided testable affine-invariant properties, 
similar to the characterization for dense graph properties [AS08a]. 

Before proceeding, let us give some examples of affine subspace hereditary properties in order 
to build intuition about how to test them. Consider the property of being affine, by which 
we mean here that the function is a polynomial of degree at most 1. This is clearly an affine- 
invariant hereditary property. As we remarked earlier, the property is known to be testable. 
Note that here, we could also have defined being affine as the condition of satisfying the identity 
$f(x) - f(x+y) - f(x+z) + f(x+y+z) = 0$ for every $x, y, z \in \mathbb{F}_p^n$. This is a "local" characterization
of being affine, in the sense that the functional equation does not depend on the value of n.
Moreover, this characterization automatically suggests a 4-query test: pick random $x, y, z \in \mathbb{F}_p^n$
and check whether the identity holds or not for that choice of x, y, z. 
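
The 4-query test just described can be sketched as follows. This is an illustrative Python version under stated assumptions: points of $\mathbb{F}_p^n$ are modelled as tuples of integers mod p, `f` is any Python callable returning a value in $\mathbb{F}_p$, and the function name `affine_test`, the number of repetitions and the fixed seed are choices made here, not taken from the paper.

```python
import random


def affine_test(f, n, p, trials=100, seed=0):
    """Repeat the 4-query check; reject (return False) on the first violation."""
    rng = random.Random(seed)

    def rand_point():
        return tuple(rng.randrange(p) for _ in range(n))

    def add(*pts):
        # Coordinate-wise sum of points of F_p^n.
        return tuple(sum(coords) % p for coords in zip(*pts))

    for _ in range(trials):
        x, y, z = rand_point(), rand_point(), rand_point()
        # The identity f(x) - f(x+y) - f(x+z) + f(x+y+z) = 0 over F_p.
        value = (f(x) - f(add(x, y)) - f(add(x, z)) + f(add(x, y, z))) % p
        if value != 0:
            return False
    return True
```

An affine function passes every round, so the test has one-sided error; a function that violates the identity on a noticeable fraction of triples is rejected with high probability once `trials` is large enough.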

More generally, consider the property of being a polynomial of degree at most d, for some fixed 
positive integer d. Again, the property is clearly affine subspace hereditary. It is also known 
to be testable [AKK+05] over finite fields. And just as in the case of linearity, the test arises 
out of a local characterization for degree d: the $(d+1)$th derivative in every $d+1$ directions at
every point should be 0. The test is then to choose a random point and random $d+1$ directions
and to check whether the $(d+1)$th derivative in the chosen directions at the chosen point is 0
or not.
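
As an illustration of this derivative-based test, the sketch below expands the iterated derivative $D_{h_1} \cdots D_{h_{d+1}} f(x)$, with $D_h f(x) = f(x+h) - f(x)$, into its $2^{d+1}$ signed evaluations. The names `iterated_derivative` and `low_degree_test`, the trial count and the seed are assumptions made for this example; it is not the tester analyzed in [AKK+05].

```python
import random
from itertools import product


def iterated_derivative(f, x, directions, p):
    """Value of D_{h_1} ... D_{h_k} f at x, where D_h f(x) = f(x + h) - f(x)."""
    k = len(directions)
    total = 0
    for mask in product((0, 1), repeat=k):
        point = list(x)
        for bit, h in zip(mask, directions):
            if bit:
                point = [(a + b) % p for a, b in zip(point, h)]
        # Sign is (-1)^(k - |S|) for the subset S of directions that were added.
        sign = -1 if (k - sum(mask)) % 2 else 1
        total += sign * f(tuple(point))
    return total % p


def low_degree_test(f, n, p, d, trials=200, seed=0):
    """Reject (return False) if some sampled (d+1)-fold derivative is nonzero."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = tuple(rng.randrange(p) for _ in range(n))
        dirs = [tuple(rng.randrange(p) for _ in range(n)) for _ in range(d + 1)]
        if iterated_derivative(f, x, dirs, p) != 0:
            return False
    return True
```

Each round makes $2^{d+1}$ queries, a number that depends on d but not on n.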

In fact, one can describe any affine subspace hereditary property using (finitely or infinitely 
many) such local characterizations. To state this formally, let us put forth a useful definition. 

¹We omit the technical definitions of "natural" and "close", since they are unimportant here. Informally,
the behavior of a "natural" tester is independent of the size of the domain and "close" means that the property
deviates from an actual affine subspace hereditary property on functions over a finite domain. See [BGS10] for
details, or [AS08a] for the analogous definitions in a graph-theoretic context. 
Definition 1.3 (Affine constraints)



• An affine constraint of size $m$ on $\ell$ variables is a tuple $A = (a_1, \dots, a_m)$ of $m$ linear
forms $a_1, \dots, a_m$ over $\mathbb{F}_p$ on $\ell$ variables, where $a_1(X_1, \dots, X_\ell) = X_1$ and for every $i \geq 2$,
$a_i(X_1, \dots, X_\ell) = X_1 + \sum_{j=2}^{\ell} c_{i,j} X_j$ where each $c_{i,j} \in \mathbb{F}_p$.

• An induced affine constraint of size $m$ on $\ell$ variables is a pair $(A, \sigma)$ where $A$ is an affine
constraint of size $m$ on $\ell$ variables and $\sigma \in [R]^m$.

• Given such an induced affine constraint $(A, \sigma)$, a function $f : \mathbb{F}_p^n \to [R]$ is said to be $(A, \sigma)$-free
if there exist no $x_1, \dots, x_\ell \in \mathbb{F}_p^n$ such that $(f(a_1(x_1, \dots, x_\ell)), \dots, f(a_m(x_1, \dots, x_\ell))) = \sigma$.
On the other hand, if such $x_1, \dots, x_\ell$ exist, we say that $f$ induces $(A, \sigma)$ at $x_1, \dots, x_\ell$.

• Given a (possibly infinite) collection $\mathcal{A} = \{(A^1, \sigma^1), (A^2, \sigma^2), \dots, (A^i, \sigma^i), \dots\}$ of induced
affine constraints, a function $f : \mathbb{F}_p^n \to [R]$ is said to be $\mathcal{A}$-free if it is $(A^i, \sigma^i)$-free for
every $i \geq 1$.
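
The definition of inducing a constraint can be checked by exhaustive search on tiny instances. The following Python sketch assumes a constraint is encoded as a list of coefficient tuples, one per linear form, each of length $\ell$ and starting with the coefficient 1 of $X_1$; this encoding and the name `find_induced_copy` are hypothetical conveniences for the example.

```python
from itertools import product


def find_induced_copy(f, forms, sigma, n, p):
    """Return (x_1, ..., x_l) at which f induces (forms, sigma), or None if f is free."""
    ell = len(forms[0])
    points = list(product(range(p), repeat=n))
    for xs in product(points, repeat=ell):
        values = []
        for coeffs in forms:
            # Evaluate the linear form coordinate-wise on the chosen points.
            point = tuple(
                sum(c * x[k] for c, x in zip(coeffs, xs)) % p for k in range(n)
            )
            values.append(f(point))
        if tuple(values) == tuple(sigma):
            return xs
    return None
```

For example, with the three forms $X_1$, $X_1 + X_2$, $X_1 + 2X_2$ over $\mathbb{F}_3$ and $f$ the indicator of the origin, the pattern (1, 1, 1) is induced at $x_1 = x_2 = 0$, while the pattern (1, 0, 1) is never induced. The search visits $p^{n\ell}$ tuples, so it is only meant for very small parameters.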

The connection between affine subspace hereditariness and affine constraints is given by the 
following simple observation. 

Observation 1.4 An affine-invariant property V is affine subspace hereditary if and only if it is
equivalent to the property of $\mathcal{A}$-freeness for some fixed collection $\mathcal{A}$ of induced affine constraints.

Proof: Given an affine-invariant property V, a simple (though inefficient) way to obtain
the set $\mathcal{A}$ is to let it be the following: For every $n$ and a function $f : \mathbb{F}_p^n \to [R]$ that is not in V,
we include in $\mathcal{A}$ the constraint $(A_f, \sigma_f)$, where $A_f$ is indexed by members of $\mathbb{F}_p^n$ and contains
$\{a_z(X_1, \dots, X_{n+1}) = X_1 + \sum_{i=1}^{n} z_i X_{i+1} : z = (z_1, \dots, z_n) \in \mathbb{F}_p^n\}$, and $\sigma_f$ is just set to $f$. From
here it is easy to see that the property defined by $\mathcal{A}$ is contained in V, while containment in
the other direction follows from V being affine-invariant and hereditary.

The other direction of the observation is trivial. ■ 

Thus, resolving Conjecture 1.2 boils down to showing testability for all $\mathcal{A}$-freeness properties.

1.3 Main Result

We show that $\mathcal{A}$-freeness is testable as long as all affine constraints in $\mathcal{A}$ are of complexity
less than p. We next define the complexity of an affine constraint, and more generally, of an 
arbitrary set of linear forms. 

Definition 1.5 (Cauchy-Schwarz complexity, [GT10b]) Let $\mathcal{L} = \{L_1, \dots, L_m\}$ be a set of
linear forms. The (Cauchy-Schwarz) complexity of $\mathcal{L}$ is the minimal $s$ such that the following
holds. For every $i \in [m]$, we can partition $\{L_j\}_{j \in [m] \setminus \{i\}}$ into $s + 1$ subsets such that $L_i$ does not
belong to the linear span of any subset.
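
For small sets of linear forms, the definition above can be evaluated directly by trying all partitions. The sketch below is a brute-force illustration, assuming each form is given as a tuple of coefficients over $\mathbb{F}_p$; the helper names `rank_mod_p`, `in_span` and `cs_complexity` are introduced here for the example, and `None` is returned when no finite complexity exists (some form is a multiple of another).

```python
from itertools import product


def rank_mod_p(rows, p):
    """Rank of a list of coefficient vectors over F_p (Gaussian elimination)."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse, Python 3.8+
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def in_span(target, subset, p):
    """True if `target` lies in the F_p-linear span of `subset`."""
    return rank_mod_p(list(subset) + [target], p) == rank_mod_p(list(subset), p)


def cs_complexity(forms, p):
    """Minimal s such that every form escapes the span of each of s + 1 classes."""
    m = len(forms)
    worst = 0
    for i in range(m):
        others = [forms[j] for j in range(m) if j != i]
        for k in range(1, max(m, 2)):  # k = s + 1 classes
            colorings = product(range(k), repeat=len(others))
            if any(
                all(
                    not in_span(forms[i], [L for L, c in zip(others, col) if c == cls], p)
                    for cls in range(k)
                )
                for col in colorings
            ):
                worst = max(worst, k - 1)
                break
        else:
            return None  # no partition works: some form is a multiple of another
    return worst
```

On the forms of a 3-term arithmetic progression $x, x+y, x+2y$ over $\mathbb{F}_5$ this gives complexity 1, on the 4-term progression it gives 2, and on the four forms $x, x+y, x+z, x+y+z$ of the affinity identity it gives 1.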

Given this, one can formulate a conjecture that is a weakened version of Conjecture 1.2: 

Conjecture 1.6 A property that is given by a collection of induced affine constraints with a 
global bound on their complexity is testable with a 1-sided error.
The following is our main result, which shows the above when the complexity bound is strictly
smaller than the field size. 

Theorem 1.7 (Main theorem) For any $\epsilon \in (0, 1)$ and for any (possibly infinite) fixed collection
$\mathcal{A} = \{(A^1, \sigma^1), (A^2, \sigma^2), \dots, (A^i, \sigma^i), \dots\}$ of induced affine constraints such that each $A^i$
has complexity less than $p$, there is a function $q_{\mathcal{A}} : (0, 1) \to \mathbb{Z}^+$ and a one-sided tester which
determines whether a function $f : \mathbb{F}_p^n \to [R]$ is $\mathcal{A}$-free or $\epsilon$-far from being $\mathcal{A}$-free, by making at
most $q_{\mathcal{A}}(\epsilon)$ queries to $f$.

The function $q_{\mathcal{A}}$ has a rather horrible, Ackermann function-like, dependence on $1/\epsilon$. Our primary
concern in this work though is to establish testability, and we make no effort to improve
the growth of $q_{\mathcal{A}}$. We note though that recent work by Kalyanasundaram and Shapira [KS11]
and by Conlon and Fox [CF11], building on previous work by Gowers [Gow97], suggests that
the very rapid growth of the query complexity function is in fact inherent in the nature of the
problem.

Let us lastly note that Theorem 1.7 is quite nontrivial even when the collection A is finite. 
Indeed, even if A consists only of a single induced affine constraint of complexity greater than 
1, it was not known previously how to show testability. We give more details about past work 
in Section 1.5. 



1.4 Overview of the Proof 



To show Theorem 1.7, we will in fact show the following statement. Note that it uses a yet 
undefined notion of "conciseness"; for now it suffices to know that every A is equivalent to a 
concise one, as we will later prove. 



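Theorem 1.8 Suppose we are given a possibly infinite collection of labeled affine constraints
$\mathcal{A} = \{(A^1, \sigma^1), (A^2, \sigma^2), (A^3, \sigma^3), \dots\}$ where $\mathcal{A}$ is concise, every $A^i$ is of complexity less
than $p$ and consists of $m_i$ linear forms over $\ell_i$ variables, and $\sigma^i \in [R]^{m_i}$ for every $i$. Then, there
are functions $\ell_{\mathcal{A}}(\cdot)$ and $\delta_{\mathcal{A}}(\cdot)$ such that the following is true for any $\epsilon \in (0, 1)$. If a function
$f : \mathbb{F}_p^n \to [R]$ with $n > \ell_{\mathcal{A}}(\epsilon)$ is $\epsilon$-far from being $\mathcal{A}$-free, then $f$ induces at least $\delta_{\mathcal{A}}(\epsilon) \cdot p^{n \ell_i}$ many copies of
$(A^i, \sigma^i)$ for some $i$ such that $\ell_i < \ell_{\mathcal{A}}(\epsilon)$.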



Theorem 1.7 immediately follows. Consider the following test: choose uniformly at random
$x_1, \dots, x_{\ell_{\mathcal{A}}(\epsilon)} \in \mathbb{F}_p^n$, let $H$ denote the affine space $\{x_1 + \sum_{j=2}^{\ell_{\mathcal{A}}(\epsilon)} c_j x_j : c_j \in \mathbb{F}_p\}$, and check
whether $f$ restricted to $H$ is $\mathcal{A}$-free or not. By Theorem 1.8, if $f$ is $\epsilon$-far from $\mathcal{A}$-freeness,
then this test rejects with probability at least $\delta_{\mathcal{A}}(\epsilon)$. Repeating the test $O(1/\delta_{\mathcal{A}}(\epsilon))$ times then
guarantees a constant rejection probability. And of course, if $f$ is $\mathcal{A}$-free, the test always accepts.
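
One round of this test can be sketched in Python as follows. Several simplifications are assumed: the collection is a finite list of (forms, sigma) pairs with forms encoded as coefficient tuples, the parameter `num_points` merely stands in for the quantity $\ell_{\mathcal{A}}(\epsilon)$ (which the sketch does not compute), and the restriction to $H$ is checked by exhaustive search over the parametrization of $H$. The names `induces` and `subspace_test` are illustrative only.

```python
import random
from itertools import product


def induces(g, forms, sigma, dim, p):
    """True if g : F_p^dim -> labels induces the constraint (forms, sigma) somewhere."""
    ell = len(forms[0])
    points = list(product(range(p), repeat=dim))
    for xs in product(points, repeat=ell):
        values = tuple(
            g(tuple(sum(c * x[k] for c, x in zip(coeffs, xs)) % p for k in range(dim)))
            for coeffs in forms
        )
        if values == tuple(sigma):
            return True
    return False


def subspace_test(f, n, p, constraints, num_points, rng):
    """One round: sample x_1..x_L, restrict f to H, accept iff no constraint is induced."""
    xs = [tuple(rng.randrange(p) for _ in range(n)) for _ in range(num_points)]

    def g(c):
        # Parametrize H = {x_1 + sum_j c_j x_j}; c ranges over F_p^(L-1).
        return f(tuple(
            (xs[0][k] + sum(cj * xj[k] for cj, xj in zip(c, xs[1:]))) % p
            for k in range(n)
        ))

    dim = num_points - 1
    return not any(induces(g, forms, sigma, dim, p) for forms, sigma in constraints)
```

Because the parametrization of $H$ is an affine map, any copy found in the restricted function is also a copy in $f$, so a function that is free of every listed constraint is never rejected.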

Let us now give an overview of our proof of Theorem 1.8. For simplicity of exposition, assume 
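for now that $\mathcal{A}$ consists only of a single induced affine constraint $(A, \sigma)$ where $A$ is the tuple of
linear forms $(a_1, \dots, a_m)$, each over $\ell$ variables, and $\sigma \in [R]^m$. For $i \in [R]$, let $f^{(i)} : \mathbb{F}_p^n \to \{0, 1\}$
be the indicator function for the set $f^{-1}(\{i\})$. Our goal will then be to show that, when $f$ is
$\epsilon$-far from $(A, \sigma)$-free, then:

$$\mathbb{E}_{x_1, \dots, x_\ell} \left[ f^{(\sigma_1)}(a_1(x_1, \dots, x_\ell)) \cdot f^{(\sigma_2)}(a_2(x_1, \dots, x_\ell)) \cdots f^{(\sigma_m)}(a_m(x_1, \dots, x_\ell)) \right] \geq \delta(\epsilon), \qquad (1)$$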
where crucially, $\delta$ is a positive function that does not depend on $n$. If we could
show this, then we would be done since a valid test would be to repeat the following
procedure $O(1/\delta)$ times: uniformly pick $x_1, \dots, x_\ell \in \mathbb{F}_p^n$ and immediately reject if
$(f(a_1(x_1, \dots, x_\ell)), \dots, f(a_m(x_1, \dots, x_\ell))) = \sigma$.

Studying averages of products, as in (1), has been crucial to a wide range of problems in additive 
combinatorics and analytic number theory. Szemerédi's theorem about the density of arithmetic
progressions in subsets of the integers is a classic example. Szemerédi's work [Sze75] arguably
initiated such questions in additive combinatorics, but the major development which led to
a more systematic understanding of these averages was Gowers' definition of a new notion of
uniformity in a Fourier-analytic proof for Szemerédi's theorem [Gow01]. In particular, Gowers
introduced the Gowers norm $\|\cdot\|_{U^d}$ for a parameter $d \geq 1$, which allows us to say the following
about (1). If $\|f_1\|_{U^d} < \epsilon$ (for some $d$), $f_2, \dots, f_m$ are arbitrary functions that are bounded inside
$[-1, 1]$, and $L_1, \dots, L_m$ are arbitrary linear forms, then $\mathbb{E}_{x_1, \dots, x_\ell \in \mathbb{F}_p^n} \left[ \prod_{i=1}^{m} f_i(L_i(x_1, \dots, x_\ell)) \right]$
is at most $\epsilon$.
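
For orientation, the standard definition of the Gowers norm of a real-valued function is recalled below. It is stated here as a reminder of the usual textbook form and may differ in notation from the formulation used later in the paper (Subsection 3.1); for complex-valued functions, the factors with $|S|$ odd are complex-conjugated.

```latex
\|f\|_{U^d}^{2^d}
  = \mathbb{E}_{x, h_1, \dots, h_d \in \mathbb{F}_p^n}
    \prod_{S \subseteq [d]} f\Bigl(x + \sum_{i \in S} h_i\Bigr),
\qquad f : \mathbb{F}_p^n \to \mathbb{R}.
```

For $d = 1$ the right-hand side is $(\mathbb{E}_x f(x))^2$, so $\|\cdot\|_{U^1}$ is only a seminorm; for $d \geq 2$ it is a genuine norm.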

This observation leads to the study of decomposition theorems, that express an arbitrary function 
as a linear combination of functions which have either small Gowers norm or are structured in 
some sense. This is an extension of classical Fourier analysis over $\mathbb{F}_p^n$, where a function is
expressed as a linear combination of a small number of characters with high Fourier mass plus a 
small error term. To deal with Gowers norm, the "characters" need to be exponentials of not only 
linear functions, as in classical Fourier analysis, but of higher degree polynomials. Approximate 
orthogonality among these "characters" was established by Green and Tao in [GT09] and by 
Kaufman and Lovett in [KL08]. At this stage, one might expect that results by Hatami and 
Lovett [HL11a, HL11b] can allow us to use orthogonality to approximate the expectation of the
form in (1). 

Unfortunately, the proof does not follow that easily from [HL11a]. There are two main reasons
for this. The first is that the only information we have about the original function $f$ is $\epsilon$-farness
from $(A, \sigma)$-freeness. Information about correlation, as was assumed in [HL11a], allows more
straightforward application of the higher-order Fourier analytic tools. We use ideas inspired by
previous work on property testing in the dense model, as in [AFKS00] and [AS08b], to locate
regions of the domain in which we are guaranteed to find at least one induced occurrence of
$(A, \sigma)$. This leads to a new combinatorially flavored decomposition theorem (Theorem 4.12),
which may be of independent interest. 

The second problem we face is one which also arose in a work by Green and Tao on decomposition 
theorems (a.k.a. regularity lemmas) over the integers [GT10a]. Namely, the decomposition
theorem we use decomposes an arbitrary function $f : \mathbb{F}_p^n \to \mathbb{R}$ into a sum of three functions
$f_1, f_2, f_3$. Here $f_1$ consists of the approximate "characters" as mentioned above, $f_2$ has small Gowers
norm, and $f_3$ has low $L^2$-norm. Now, the deviation from orthogonality for $f_1$ and the
Gowers norm of $f_2$ decrease as a function of the "complexity" of the decomposition,
and are thus essentially negligible for the purposes of the proof. On the other hand, the bound
on the $L^2$-norm for $f_3$ is only moderately small and cannot be made to decrease as a function
of the complexity of the decomposition. To get around this, we essentially use a sequence of two
decompositions, and make the norm of the second one decrease as a function of the complexity
of the first, where we show that this is enough for our purposes.

1.5 Previous Work 

This work is part of a sequence of works investigating the relationship between invariance and 
testability of properties. As described, Kaufman and Sudan [KS08] initiated the program. 
Subsequently, Bhattacharyya, Chen, Sudan and Xie [BCSX11] investigated monotone linear-
invariant properties of functions $f : \mathbb{F}_2^n \to \{0, 1\}$, where a property V is monotone if it satisfies
the condition that for any function $g \in V$, modifying g by changing some outputs from 1 to 0
does not make it violate V. Král', Serra and Vena [KSV12] and, independently, Shapira [Sha09]
showed testability for any monotone linear-invariant property characterized by a finite number
of linear constraints (of arbitrary complexity).

Progress has been significantly slower for the non-monotone properties. Bhattacharyya, Grigorescu,
and Shapira proved in [BGS10] that linear-invariant properties of functions in $\{\mathbb{F}_2^n \to \{0, 1\}\}$
are testable if the complexity of the property is 1. When restricted to affine-invariant
properties, the result of [BGS10] is a special case of the main result here for p = 2. The previous
works did not explicitly use higher-order Fourier analysis; [KSV12] and [Sha09] used variants 
of the hypergraph regularity lemma which are similar in spirit to higher-order Fourier analysis, 
but are somewhat harder to manipulate due to the lack of analytic tools. 

Higher-order Fourier analysis began with the work of Gowers [Gow98] and parallel ergodic-
theoretic work by Host and Kra [HK05]. Applications to analytic number theory inspired much
more study by Gowers, Green, Tao, Wolf, and Ziegler among others. A book in preparation
by Tao [Tao11] surveys the current theory of higher-order Fourier analysis. Our work in this
paper relies on decomposition theorems over finite fields of the type first explicitly described by
Green in [Gre07]. We also heavily use decomposition results by Hatami and Lovett [HL11a], as
described in the previous section. 

At a high level, the argument to prove our main theorem mirrors ideas used in a sequence of 
works [AFKS00, AS08b, AS08a, FN07, AFNS06, BCL+06] to characterize the testable graph
properties. In particular, the technique of simultaneously decomposing the domain into a coarse
partition and a fine partition with very strong regularity properties is due to [AFKS00], and
the compactness argument used to handle infinitely many constraints is due to [AS08b]. However,
implementing these graph-theoretic techniques using higher-order Fourier analysis required
several new ideas which, we hope, can be extended to eventually prove Conjecture 1.6. 

1.6 Further research 

We study affine subspace hereditary properties, and show that if they are defined by affine 
constraints of low complexity then they are locally testable. There are several obvious possible 
generalizations to this work: 

1. Remove the condition that the field size is larger than the complexity of the affine forms, 
thus proving Conjecture 1.6; this requires non-trivial generalizations of several technical 
lemmas to small fields, and may require new methods. 

2. Handle all linear invariant properties (and not just affine invariant properties). 

A third generalization, which might be too strong to hold, is to completely remove the bounded 
complexity assumption on the linear forms, thus proving Conjecture 1.2. In several analogs of 
this line of research in hypergraph testing, this requirement is analogous to requiring bounded 
uniformity from the hypergraphs, which is implicitly assumed in all previous works on hypergraph
testing. It would thus also be interesting if the full Conjecture 1.2 could be disproved.
2 Map of the proof



The rest of this section will be devoted to an informal description of the building blocks required
to prove Theorem 1.7, and by extension Theorem 1.8. We believe that some of these building 
blocks, and especially the "Super Decomposition" Theorem 4.9 that we describe below, will be 
of independent interest. 

In Section 3 and Section 4, we develop the main technical tools that we will need for our 
testability proof. Some of the following lemmas and arguments were proved before: Decompo- 
sition lemmas (without rank) were implicit in previous works by Green and Tao and explicit 
in [HL11a]; the existence of a refinement of a given rank was first proved in [GT09] (which is
combined here with a decomposition lemma); other prior works are cited along with the proofs 
below. 

Our new contributions there lie in the following: 

• Our final "Super Decomposition" Theorem 4.9, and its related "Subcell Selection" Corollary 
4.10, are new. Their relation to the original decomposition lemma could be thought of as 
somewhat akin to the relation of the strong graph regularity lemma in [AFKS00] to the
original regularity lemma of Szemerédi.

• For the subcell selection corollary to work at all, we need to keep careful track of whether
a refinement of a partition by polynomials is syntactic (i.e. there is a containment relationship
between the polynomials defining the two partitions) or merely semantic (i.e.
the polynomials may be different but the partitions they define satisfy a combinatorial
refinement relationship). We add the accounting of syntactic vs. semantic refinements to
all the arguments leading up to our super decomposition theorem.

• We set the entire analysis in a "robustness" framework akin to the one developed for 
graphs in [FN07]. This streamlines the argument (essentially allowing us to encapsulate 
and move away iterative refinement arguments), which could get very unwieldy by the 
time the super decomposition theorem is reached. 

In Section 5, we then develop algebraic and combinatorial constructions that allow us to
use Corollary 4.10 to provide counting type theorems, and in our case the main "algebro-
combinatorial" Theorem 1.8. The algebraic part mostly involves procedures that calculate the
numbers of affine configurations of a given type that satisfy given polynomial constraints; we
also prove, using basic algebra, that we can assume the technical condition of $\mathcal{A}$ being "concise",
that is, not having more variables than conditions in any of its constraints. The combinatorial
part is the "cleanup" procedure that we describe below. 

We now describe the main components of our proofs in detail.

2.1 Partition by Polynomial Factors

We generally deal with a function $f : \mathbb{F}_p^n \to \{0, 1\}$ (where a larger fixed size range [R] is handled
by considering a sequence of functions rather than one function; see Subsection 4.3), and would
like to partition its domain $\mathbb{F}_p^n$ into a small number of regions, so that $f$ has certain "randomness"
properties in every region (or at least most of them). In the broadest terms, we seek algebraic
analogs to Szemerédi's regularity lemma and its derivatives that have revolutionized graph
theory. Recall that Szemerédi's lemma partitions the vertex set of the graph so that most
vertex set pairs exhibit random-like properties in the bipartite subgraphs that they induce.

The groundwork providing this started with the works of Green and Tao. In general, a function 
$f : \mathbb{F}_p^n \to \{0, 1\}$ can be decomposed into a sum of three real-valued functions: one that is constant
on large regions of the input, one that generally takes small values (in terms of its $\ell_2$ norm),
and one that is "very random" (in the sense of the Gowers norm). The relevance of the Gowers
norm to our arguments is highlighted in Subsection 3.1. 

In an ideal world, the large regions of the input over which we have a constant function should 
come from a partition of $\mathbb{F}_p^n$ into affine subspaces, but in fact this cannot be the case. The next
best thing is to have a partition based on the values of a fixed length sequence of low degree
polynomials over $\mathbb{F}_p^n$. These are called polynomial factors as per Definition 3.4, and the regions
of $\mathbb{F}_p^n$ of their respective partitions are called cells.

However, now we need to re-address the question of independence. Standard linear independence 
would be insufficient to even guarantee that all regions are of similar sizes, let alone provide 
other "randomness" features. For this we use the notion of polynomial rank, first developed in 
[GT09]. Subsection 3.2 provides the details about polynomial factors and their rank. 
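
The partition into cells, and the way linear independence alone fails to balance cell sizes, can be seen on a toy example. The sketch below assumes polynomials are given as Python callables on tuples; the name `factor_cells` is made up for the illustration and the formal notions are those of Definition 3.4 and Subsection 3.2.

```python
from collections import defaultdict
from itertools import product


def factor_cells(polys, n, p):
    """Group the points of F_p^n by the tuple of values of the given polynomials."""
    cells = defaultdict(list)
    for x in product(range(p), repeat=n):
        key = tuple(P(x) % p for P in polys)
        cells[key].append(x)
    return dict(cells)


# Two factors over F_3^2, each defined by two linearly independent polynomials.
unbalanced = factor_cells([lambda x: x[0], lambda x: x[0] * x[1]], 2, 3)
balanced = factor_cells([lambda x: x[0], lambda x: x[1]], 2, 3)
```

Here `unbalanced` has seven cells, one of size 3 (where $x_1 = 0$ forces the product to vanish) and six of size 1, while `balanced` has nine cells of size 1 each. The polynomials $x_1$ and $x_1 x_2$ are linearly independent, yet their cells are uneven, which is the kind of behavior a rank condition is meant to exclude.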

2.2 Refinements and the Robustness Framework 

For our purpose it is not enough to prove the existence of certain factors, and we will consider a
relationship between pairs of factors, namely the refinement relationship. There are two kinds of
refinements. The "combinatorial" semantic refinement notion means that the partition induced 
by the second factor consists of subsets of the sets of the first factor, while the "explicit" syntactic 
refinement notion means that the second factor is in fact defined by a sequence of polynomials 
extending the sequence that defines the first factor. Definition 3.9 provides the details. 
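
The difference between the two notions can be illustrated with a small Python sketch. It assumes factors are lists of Python callables and models "extending the sequence" as the coarse list being a literal prefix of the fine list; the precise definitions are those of Definition 3.9, and the function names here are hypothetical.

```python
from itertools import product


def cell_of(polys, x, p):
    """The cell of x: the tuple of values of the factor's polynomials at x."""
    return tuple(P(x) % p for P in polys)


def is_semantic_refinement(coarse, fine, n, p):
    """Every cell of `fine` lies inside a single cell of `coarse`."""
    seen = {}
    for x in product(range(p), repeat=n):
        fine_key, coarse_key = cell_of(fine, x, p), cell_of(coarse, x, p)
        if seen.setdefault(fine_key, coarse_key) != coarse_key:
            return False
    return True


def is_syntactic_refinement(coarse, fine):
    """`fine` is literally `coarse` followed by extra polynomials."""
    return len(fine) >= len(coarse) and all(a is b for a, b in zip(coarse, fine))
```

For instance, over $\mathbb{F}_3$ the factor defined by $2x_1$ induces the same partition as the factor defined by $x_1$, so it is a semantic refinement, but it is not a syntactic one because the defining polynomials differ.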

An important measure of a factor with respect to a function $f : \mathbb{F}_p^n \to \{0, 1\}$ is its density index,
as per Definition 3.11. This was used in previous decomposition proofs, and is analogous to the
index of a graph partition used in the proof of Szemerédi's regularity lemma and its variants.
In Subsection 3.3 we introduce and analyze the framework of factor robustness, where a factor 
is considered robust if it cannot be refined (with respect to a size bound given as a function of 
the current size) in a way that significantly increases its index. Robust factors, including ones 
that refine existing factors, exist by a simple argument, Observation 3.13. 

The robustness framework greatly simplifies the arguments used to prove the decomposition 
theorems in Section 4. Where previously such proofs used an iterative argument, basically 
repeating a construction of a refining factor as long as the factor does not provide the required 
properties, in the proofs here we start with a robust factor and then show that it provides the 
required object. 

However, we need a factor to be both robust and of high rank. The high rank requirement 
(also as a function of the factor size) is in fact also provided through an iterative argument 
resembling the proof of regularity. In Subsection 3.4 we integrate arguments similar to those 
originally made in [GT09] to provide Lemma 3.19, the driving engine of our decomposition 
theorems. This lemma provides a factor that is both robust and of high rank. Moreover, if we
start from an existing factor that is a syntactic refinement of a base factor that also has high
rank, then our new robust factor will additionally be a syntactic refinement of the same base
factor. This is crucial to our super decomposition theorem, which requires such a refinement to
be provided. 
2.3 Decompositions and Super Decompositions

Chronologically, decomposition theorems for functions $f : \mathbb{F}_p^n \to \{0, 1\}$ have progressed in stages.
First a weak decomposition theorem was shown, where a factor is found and $f$ is decomposed
into a sum of two functions, $f = f_1 + f_2$, where $f_1 : \mathbb{F}_p^n \to [0, 1]$ is constant over every cell of the
factor, and $f_2 : \mathbb{F}_p^n \to [-1, 1]$ has a bounded Gowers norm. In an ideal world we would like $f_2$
to have a bounded $\ell_2$ norm, as it denotes an "error" of some kind, but this is not possible.
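
To show the shape of such a two-term decomposition, the sketch below takes $f_1$ to be the average of $f$ over each cell, which is a common choice in arguments of this kind and is assumed here rather than taken from the paper. It only produces the two pieces for a given factor; it says nothing about how to find a factor for which $f_2$ has small Gowers norm. The name `cell_average_decomposition` is illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def cell_average_decomposition(f, polys, n, p):
    """Return dicts f1, f2 on F_p^n with f = f1 + f2 and f1 constant on each cell."""
    cells = defaultdict(list)
    for x in product(range(p), repeat=n):
        cells[tuple(P(x) % p for P in polys)].append(x)
    f1, f2 = {}, {}
    for members in cells.values():
        # Average of f over the cell, kept exact with Fraction.
        avg = Fraction(sum(f(x) for x in members), len(members))
        for x in members:
            f1[x] = avg
            f2[x] = f(x) - avg
    return f1, f2
```

With this choice $f_1$ takes values in $[0, 1]$, $f_2$ takes values in $[-1, 1]$, and $f_2$ averages to zero on every cell.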

However, for the Gowers norm bound to be of any use, it has to be bounded as a decreasing 
function of the factor size C. The next step was then to find a factor and a decomposition 
$f = f_1 + f_2 + f_3$, where $f_3$ is an "error" term that is of bounded $\ell_2$ norm (as we originally
intended), and $f_2$ now has a Gowers norm that is smaller than the required function of C. The
proof "internally" uses a sequence of two factors, one refining the other, and a corresponding 
"iterated argument of iterated arguments". However here we can encapsulate it through a 
robustness requirement. We provide the full details in Subsection 4.1, which culminates in 
Theorem 4.4, providing also a rank requirement. It is similar to theorems proved in previous 
works, but here we also maintain a syntactic refinement relationship to a base factor, a feature 
that will be used later. 

This brings us to our new super decomposition Theorem 4.9. Its motivation is that for our 
purpose, we would also need the $\ell_2$ norm of the error function $f_3$ to decrease as a function of
the factor size. This is required because for our analysis of non-monotone properties, we cannot
make do with most of the cells of the factor exhibiting a random-like behavior of $f$; we would
like all of them to exhibit it. However, such a demand on $f_3$ is clearly not possible.

The solution is then to provide a sequence of two factors, where the second factor is a syntactic 
refinement of the first. We then decompose $f$ with respect to the second factor, as a sum of
a constant-over-cells function $f_1$, a small Gowers norm function $f_2$, and a function $f_3$ whose $\ell_2$
norm is not small as a function of the second factor, but at least it is small as a function of
the first factor. Additionally, we want $f_1$ to be "faithful" also with respect to the first factor:
that is, if we had decomposed $f$ according to the first factor rather than the second, then
the corresponding "$f_1$ function" would still be close in most places to the function we got by
decomposing according to the second factor. 

In the next step of the proof of our main testability theorem, we will pick one "subcell", a cell
of the second factor, out of every cell of the first factor. We will want most of these cells to be
faithful (with respect to $f_1$) and all of them to exhibit the randomness properties. The syntactic
refinement relationship in our super decomposition theorem is what allows us to pick these cells 
in a "uniform" manner, as per our subcell selection Corollary 4.10. 

We believe that Theorem 4.9 and its proof methods are of independent interest, as they could 
open up possibilities for more analogies to the big body of knowledge concerning the applications 
of Szemerédi's regularity lemma and its variants for graphs.

2.4 Function Cleanup 

To find many induced structures in $f$, we restrict ourselves to the "good" subcells chosen by use
of Corollary 4.10. However, to find the correct configuration of subcells exhibiting the induced
structures, we refer to a modification of $f$ called a cleanup. The modified $f$ will be close to the
original, and hence will still contain an induced structure. This particular structure might not
exist in the original $f$, but the way the cleanup is performed, as per Definition 5.14, ensures the
existence of the corresponding subcell configuration which "mimics" the location of the points of
the structure (even though it may not actually contain those points). We then use the configuration
of subcells with respect to the original $f$ to find our affine structures.

This argument is in fact somewhat analogous to the argument considering forbidden induced 
subgraphs that appeared first in [AFKS00]. The function closeness lemma is Lemma 5.15, while
the mimicking subcell argument is found in the proof of Theorem 1.8 in Subsection 5.4. 

2.5 Randomness and consistency 

After we find the subcell configuration corresponding to an affine induced structure, we still 
need to lower-bound the number of actual copies of the structure that it guarantees for $f$. This
requires giving a lower bound for the number of actual small affine sets that reside in this
configuration, and within them the number of sets for which $f$ has the corresponding values.
The second task is in fact accomplished by the function decomposition that we have. For
the first task, we build upon works of Hatami and Lovett [HL11b] and of Gowers and Wolf
[GW10b, GW10a] in Subsection 5.1.

We use there the notion of consistent values, Definition 5.5, as an algebraic characterization of
when a configuration of cells is feasible for a given affine structure. This allows us to regulate
"all-or-nothing" lemmas from previous works in Theorem 5.7, to provide a calculated bound for 
the number of structures. We also utilize it for Lemma 5.8, showing that the subcell selection 
process does not "spoil" a good configuration. 

2.6 Wrapping Up 

There are some final ingredients that we need before finalizing the proof of Theorem 1.8. One
of them is a compactness argument, analogous to the one made in [AS08a], to be able to
bound the size of the constraints we need to test for, even when the property is defined by an
infinite number of constraints. In our case, we also need to perform a slight "preprocessing" of the
representation of the property, to make it concise as per Definition 5.19, which is done through
Lemma 5.18. Apart from this, Subsection 5.3 contains a few other algebraic tools that help
with the calculations used in the proof. 

Finally, Subsection 5.4 contains the proof of Theorem 1.8, tying it all together, from finding a 
factor with a subcell selection, through consistency and randomness arguments, to finally using 
the function cleanup to bound from below the number of copies of the corresponding induced 
structure. 

3 Tools of the Proof 

In this section we lay the groundwork for the decomposition theorems that follow. This includes
the formal definition of partitions by polynomial factors, the definitions of factor robustness and
rank with proofs of their impact, and finally the main lemma about the existence of
partitions that are both robust and of high rank.
3.1 Functions and Norms



In the most general setting we consider functions $f : G \to \mathbb{C}$, where $G$ is a finite Abelian group^1.
Unless stated otherwise, expectations are taken over the uniform probability space with respect
to the relevant range, e.g. $\mathbb{E}_x[f(x)]$ is set to $\frac{1}{|G|}\sum_{x \in G} f(x)$. Apart from the traditional norms
such as $\|f\|_2 = \sqrt{\mathbb{E}_x[|f(x)|^2]}$, we will make extensive use of Gowers norms.



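Definition 3.1 (Gowers norm) Let $G$ be a finite Abelian group and $f : G \to \mathbb{C}$. For an
integer $k \ge 1$, the $k$'th Gowers norm of $f$, denoted $\|f\|_{U^k}$, is defined by:

\[ \|f\|_{U^k}^{2^k} = \mathbb{E}_{x, y_1, \ldots, y_k \in G} \Bigl[ \prod_{S \subseteq [k]} \mathcal{C}^{|S|} f\Bigl(x + \sum_{i \in S} y_i\Bigr) \Bigr] \]

where $\mathcal{C}$ denotes the complex conjugation operator, i.e. $\mathcal{C}^l(a + bi) = a + (-1)^l bi$ for $a, b \in \mathbb{R}$ and
integer $l$.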
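
To make Definition 3.1 concrete, the following is a minimal brute-force sketch for tiny domains. It assumes $G = \mathbb{F}_p^n$ with points stored as tuples; the names `gowers_norm` and `phase` are illustrative and do not come from the paper. The running time is $p^{n(k+1)} 2^k$, so it is only meant for sanity checks.

```python
import cmath
import itertools


def gowers_norm(f, p, n, k):
    """Brute-force U^k norm of f: F_p^n -> C, with f given as a dict keyed by tuples."""
    points = list(itertools.product(range(p), repeat=n))
    total = 0.0 + 0.0j
    # Average over x and y_1, ..., y_k of the product over all subsets S of [k].
    for x in points:
        for ys in itertools.product(points, repeat=k):
            term = 1.0 + 0.0j
            for mask in range(2 ** k):
                shifted = list(x)
                size = 0
                for i in range(k):
                    if (mask >> i) & 1:
                        size += 1
                        shifted = [(a + b) % p for a, b in zip(shifted, ys[i])]
                value = f[tuple(shifted)]
                # Apply complex conjugation when |S| is odd.
                term *= value.conjugate() if size % 2 else value
            total += term
    mean = total / (len(points) ** (k + 1))
    return abs(mean) ** (1.0 / 2 ** k)


def phase(p, exponent):
    """The p-th root of unity e(exponent) = exp(2*pi*i*exponent/p)."""
    return cmath.exp(2j * cmath.pi * (exponent % p) / p)


p, n = 3, 1
domain = list(itertools.product(range(p), repeat=n))
linear = {x: phase(p, x[0]) for x in domain}
quadratic = {x: phase(p, x[0] ** 2) for x in domain}

u2_linear = gowers_norm(linear, p, n, 2)
u2_quadratic = gowers_norm(quadratic, p, n, 2)
u3_quadratic = gowers_norm(quadratic, p, n, 3)
```

In this example the linear phase has $U^2$ norm 1, while the quadratic phase has $U^2$ norm $3^{-1/4}$ but $U^3$ norm 1, matching the intuition that the $U^{d+1}$ norm detects correlation with polynomial phases of degree at most $d$.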



Two facts about the Gowers norm will be absolutely crucial in what follows. First is the Gowers
Inverse theorem, established by [BTZ10, TZ10]. Throughout, we let $e(x)$ denote the complex
number $e^{2\pi i x/p}$ for $x \in \mathbb{F}_p$.

Theorem 3.2 (Gowers Inverse Theorem) Given positive integers $d < p$, for every $\delta > 0$,
there exists $\epsilon = \epsilon_{3.2}(\delta, p)$ such that if $f : \mathbb{F}_p^n \to \mathbb{R}$ satisfies $\|f\|_\infty \le 1$ and $\|f\|_{U^{d+1}} \ge \delta$, then
there exists a polynomial $P : \mathbb{F}_p^n \to \mathbb{F}_p$ of degree at most $d$ so that $|\mathbb{E}_x[f(x) \cdot e(P(x))]| \ge \epsilon$.



The second is a lemma due to Green and Tao [GT10b] based on repeated applications of the
Cauchy-Schwarz inequality. Refer to Definition 1.5 for the term "complexity".

Lemma 3.3 Let $f_1, \ldots, f_m : \mathbb{F}_p^n \to [-1, 1]$. Let $\mathcal{L} = \{L_1, \ldots, L_m\}$ be a system of $m$ linear
forms in $\ell$ variables of complexity $s$. Then:

\[ \Bigl| \mathbb{E}_{x_1, \ldots, x_\ell \in \mathbb{F}_p^n} \Bigl[ \prod_{i=1}^{m} f_i(L_i(x_1, \ldots, x_\ell)) \Bigr] \Bigr| \le \min_{i \in [m]} \|f_i\|_{U^{s+1}} \]



3.2 Polynomial Factors and their Rank 

While partitioning the domain to affine linear subspaces would be the most intuitive for counting 
affine cubes, we in fact need higher degree algebraic partitions. 

Definition 3.4 (Polynomial factor) A polynomial factor $\mathcal{B}$ is a sequence of polynomials
$P_1, \ldots, P_C : \mathbb{F}_p^n \to \mathbb{F}_p$. We also identify it with the function $\mathcal{B} : \mathbb{F}_p^n \to \mathbb{F}_p^C$ sending $x$ to
$(P_1(x), \ldots, P_C(x))$. A cell of $\mathcal{B}$ is a preimage $\mathcal{B}^{-1}(y)$ for some $y \in \mathbb{F}_p^C$. On the other hand,
given a cell of $\mathcal{B}$, the common value $y = \mathcal{B}(x) \in \mathbb{F}_p^C$ is called the image of the cell. When there
is no ambiguity, we will in fact abuse notation and identify a cell of $\mathcal{B}$ with its image $y$.

The partition induced by $\mathcal{B}$ is the partition of $\mathbb{F}_p^n$ given by $\{\mathcal{B}^{-1}(y) : y \in \mathbb{F}_p^C\}$. The complexity
of $\mathcal{B}$ is the number of defining polynomials $|\mathcal{B}| = C$. The degree of $\mathcal{B}$ is the maximum degree
among its defining polynomials $P_1, \ldots, P_C$.

^1 Later we will mostly consider $G = \mathbb{F}_p^n$. Our main theorem is formulated for functions whose range is {0, 1},
but its proof uses interim functions with larger ranges.
Next, we define the notion of conditional expectation with respect to a given factor.



Definition 3.5 (Expectation over polynomial factor) Given a factor $\mathcal{B}$ and a function
$f : \mathbb{F}_p^n \to \{0, 1\}$, the expectation of $f$ over a cell $y \in \mathbb{F}_p^C$ is the average $\mathbb{E}_{x : \mathcal{B}(x) = y}[f(x)]$, which
we denote by $\mathbb{E}[f|y]$. The conditional expectation of $f$ over $\mathcal{B}$ is the real-valued function over
$\mathbb{F}_p^n$ given by $\mathbb{E}[f|\mathcal{B}](x) = \mathbb{E}[f|\mathcal{B}(x)]$. In particular, it is constant on every cell of the polynomial
factor.



In essence we would want to choose a polynomial factor so that, among other things, the 
restriction of $f$ in every cell would essentially consist of a constant element and other elements
of small norms. However, since we are not dealing with affine linear subspaces, for our arguments 
to follow we also need the factor itself to be "well behaved". This is exemplified in the notion 
of polynomial rank [GT09], in essence a strengthening of linear independence. 



Definition 3.6 (Rank of polynomial factors) Suppose that $\mathcal{B}$ is a polynomial factor defined
by polynomials $P_1, \ldots, P_C : \mathbb{F}_p^n \to \mathbb{F}_p$. The rank of $\mathcal{B}$ is the largest integer $r$ such that for every
$(\alpha_1, \ldots, \alpha_C) \in \mathbb{F}_p^C \setminus \{0^C\}$, the polynomial $P_\alpha = \sum_{i=1}^{C} \alpha_i P_i$ cannot be expressed as a function of
$r$ polynomials of degree $d - 1$, where $d = \max_{i \in [C] : \alpha_i \ne 0} \deg(P_i)$.

The rank of a single polynomial P is defined similarly (but without needing to relate to linear 
combinations). 



The following result, proved by Kaufman and Lovett [KL08] for all $p$ (extending previous work
of Green and Tao [GT10b] over large characteristic fields), is crucial:

Theorem 3.7 For any $\epsilon > 0$ and integer $d \ge 1$, there exists $r = r_{3.7}(d, \epsilon)$ such that: If
$P : \mathbb{F}_p^n \to \mathbb{F}_p$ is a degree-$d$ polynomial with rank at least $r$, then $|\mathbb{E}_x[e(P(x))]| \le \epsilon$.

As an example of how useful Theorem 3.7 is, consider the following simple lemma which states 
that every cell of a polynomial factor with large enough rank has approximately the same size. 

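Lemma 3.8 Given a polynomial factor $\mathcal{B}$ of degree $d$, complexity $C$, and rank at least $r_{3.7}(d, \epsilon)$,
generated by the polynomials $P_1, \ldots, P_C : \mathbb{F}_p^n \to \mathbb{F}_p$, and an element $b \in \mathbb{F}_p^C$, we have that:

\[ \Pr_x[\mathcal{B}(x) = b] = p^{-C} \pm \epsilon \]

Proof: This is implicit in previous work, e.g. [Gre07]. For completeness, we repeat the
argument:

\begin{align*}
\Pr_x[\mathcal{B}(x) = b]
 &= \mathbb{E}_x\Bigl[ \prod_{i \in [C]} \frac{1}{p} \sum_{\lambda_i \in \mathbb{F}_p} e\bigl(\lambda_i (P_i(x) - b_i)\bigr) \Bigr] \\
 &= p^{-C} \sum_{(\lambda_1, \ldots, \lambda_C) \in \mathbb{F}_p^C} \mathbb{E}_x\Bigl[ e\Bigl( \sum_{i \in [C]} \lambda_i (P_i(x) - b_i) \Bigr) \Bigr] \\
 &= p^{-C} (1 \pm p^C \epsilon)
\end{align*}

where the last line uses Theorem 3.7 whenever $(\lambda_1, \ldots, \lambda_C) \ne 0^C$ (the term with all $\lambda_i = 0$ contributes exactly 1). ■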
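
As a small numerical illustration of the Fourier argument above, the sketch below enumerates a toy factor over $\mathbb{F}_3^4$ and compares the deviation of cell densities from $p^{-C}$ with the largest bias of a non-zero linear combination of the defining polynomials. The factor $(x_0 x_1, x_2 x_3)$ is a hypothetical example chosen here (it has low rank, so the deviation is visible), and all names are illustrative.

```python
import cmath
import itertools
from collections import Counter

p, n = 3, 4
points = list(itertools.product(range(p), repeat=n))


def bias(poly):
    """|E_x e(P(x))| for a polynomial P: F_p^n -> F_p given as a Python callable."""
    total = sum(cmath.exp(2j * cmath.pi * (poly(x) % p) / p) for x in points)
    return abs(total) / len(points)


# Hypothetical example factor with C = 2 quadratics in disjoint variables.
factor = [lambda x: x[0] * x[1], lambda x: x[2] * x[3]]
C = len(factor)

# Density of every cell B^{-1}(b), including empty ones.
cell_sizes = Counter(tuple(P(x) % p for P in factor) for x in points)
densities = {b: cell_sizes.get(b, 0) / len(points)
             for b in itertools.product(range(p), repeat=C)}
max_deviation = max(abs(d - p ** -C) for d in densities.values())

# Largest bias over all non-zero linear combinations of the defining polynomials.
max_bias = max(
    bias(lambda x, lam=lam: sum(l * P(x) for l, P in zip(lam, factor)))
    for lam in itertools.product(range(p), repeat=C)
    if any(lam)
)
```

Here `max_deviation` plays the role of the $\pm\epsilon$ term in Lemma 3.8 and `max_bias` the role of the $\epsilon$ supplied by Theorem 3.7; the computation in the proof guarantees that the former never exceeds the latter.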
3.3 Refinement and Robustness



The decomposition theorems will iteratively partition the domain into finer and finer
partitions (though we will use a mechanism that hides the refinements that do not have to be
"visible" for the other proofs). We will need to be careful about distinguishing between two 
different types of refinements. 

Definition 3.9 (Refinement of a polynomial factor) $\mathcal{B}'$ is called a syntactic refinement of
$\mathcal{B}$, and denoted $\mathcal{B}' \preceq_{syn} \mathcal{B}$, if the sequence of polynomials defining $\mathcal{B}'$ extends that of $\mathcal{B}$. It is
called a semantic refinement, and denoted $\mathcal{B}' \preceq_{sem} \mathcal{B}$, if the induced partition is a combinatorial
refinement of the partition induced by $\mathcal{B}$. In other words, if for every $x, y \in \mathbb{F}_p^n$, $\mathcal{B}'(x) = \mathcal{B}'(y)$
implies $\mathcal{B}(x) = \mathcal{B}(y)$. The relation $\preceq$ (without subscripts) is a synonym for $\preceq_{syn}$.

Clearly, being a syntactic refinement is stronger than being a semantic refinement. However in 
essence, these are almost the same thing. 

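Observation 3.10 If $\mathcal{B}'$ is a semantic refinement of $\mathcal{B}$, then there exists a syntactic refinement
$\mathcal{B}''$ of $\mathcal{B}$ that induces the same partition of $\mathbb{F}_p^n$ as $\mathcal{B}'$, and for which $|\mathcal{B}''| \le |\mathcal{B}'| + |\mathcal{B}|$.

Proof: Just add the defining polynomials of $\mathcal{B}'$ to those of $\mathcal{B}$. ■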

On the other hand, doing the above conversion can "destroy" the rank of a polynomial factor,
and there will indeed be situations in what follows where we will have to carefully distinguish
the two refinement types.

Next, we define the density index of a polynomial factor with respect to a function, and use it 
to define the notion of robustness, which is central to what follows. 

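Definition 3.11 The density index of a factor $\mathcal{B}$ with respect to a function $f$ is the squared $\ell_2$
norm of the conditional expectation of $f$, that is $\mathrm{ind}_d(\mathcal{B}) = \mathbb{E}[(\mathbb{E}[f|\mathcal{B}])^2]$.

Given a function $h : \mathbb{N} \to \mathbb{N}$ and a real parameter $\gamma$, a factor $\mathcal{B}$ is $(h, \gamma)$-robust (semantically)
if there exists no $\mathcal{B}'$ which is a semantic refinement of $\mathcal{B}$ for which $|\mathcal{B}'| \le h(|\mathcal{B}|)$ and $\mathrm{ind}_d(\mathcal{B}') \ge
\mathrm{ind}_d(\mathcal{B}) + \gamma$.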
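
The density index is easy to compute directly on small examples. The sketch below, under the assumption that a factor is represented as a list of Python callables over $\mathbb{F}_3^3$ (a representation chosen here for illustration only), evaluates the index for a coarse factor and for a syntactic refinement of it.

```python
import itertools
import random
from collections import defaultdict

p, n = 3, 3
points = list(itertools.product(range(p), repeat=n))


def cell_of(x, factor):
    """Image of x under the factor, i.e. the label of the cell containing x."""
    return tuple(P(x) % p for P in factor)


def density_index(f, factor):
    """E[(E[f|B])^2]: squared cell averages weighted by cell density."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x in points:
        c = cell_of(x, factor)
        sums[c] += f[x]
        counts[c] += 1
    return sum(counts[c] * (sums[c] / counts[c]) ** 2 for c in counts) / len(points)


rng = random.Random(0)  # fixed seed so the example is reproducible
f = {x: rng.randint(0, 1) for x in points}

coarse = [lambda x: x[0]]
fine = coarse + [lambda x: x[1] * x[2]]  # syntactic refinement: extends the list

index_coarse = density_index(f, coarse)
index_fine = density_index(f, fine)
```

Since `fine` refines `coarse`, `index_fine` is at least `index_coarse`, and both lie in $[0, 1]$ for a $\{0,1\}$-valued $f$. Robustness asks that no refinement of bounded size raises the index by $\gamma$ or more.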

Robustness is somewhat preserved when moving to a small refinement. 

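Observation 3.12 If $\mathcal{B}$ is $(g \circ h, \gamma)$-robust, and $\mathcal{B}'$ is a (syntactic or semantic) refinement of
$\mathcal{B}$ for which $|\mathcal{B}'| \le h(|\mathcal{B}|)$, then $\mathcal{B}'$ is $(g, \gamma)$-robust.

Proof: If $\mathcal{B}''$ is any refinement of $\mathcal{B}'$ for which $|\mathcal{B}''| \le g(|\mathcal{B}'|)$, then $|\mathcal{B}''| \le g(h(|\mathcal{B}|))$ and so
$\mathrm{ind}_d(\mathcal{B}'') < \mathrm{ind}_d(\mathcal{B}) + \gamma$. On the other hand by the Cauchy-Schwarz inequality $\mathrm{ind}_d(\mathcal{B}') \ge
\mathrm{ind}_d(\mathcal{B})$, and so $\mathrm{ind}_d(\mathcal{B}'') < \mathrm{ind}_d(\mathcal{B}') + \gamma$, proving the robustness condition of $\mathcal{B}'$. ■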

Existence of robust factors, also as syntactic refinements of a given factor, is easy to prove. Note 
that the function in its statement takes another function as one of its parameters. 

Observation 3.13 For an appropriate function $T_{3.13}(k, h, \gamma)$, for any $\mathcal{B}$, $h : \mathbb{N} \to \mathbb{N}$ and $\gamma > 0$
there exists a syntactic refinement $\mathcal{B}'$ which is $(h, \gamma)$-robust, and for which $|\mathcal{B}'| \le T_{3.13}(|\mathcal{B}|, h, \gamma)$.






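Proof: Without loss of generality we assume that $h$ is monotone non-decreasing (otherwise
replace $h(k)$ with $\max_{j \le k} h(j)$). Set $\mathcal{B}_0 = \mathcal{B}$. Inductively, if $\mathcal{B}_i$ is not already $(h, \gamma)$-robust then
set $\mathcal{B}'_i$ to be a semantic refinement of $\mathcal{B}_i$ for which $|\mathcal{B}'_i| \le h(|\mathcal{B}_i|)$ and $\mathrm{ind}_d(\mathcal{B}'_i) \ge \mathrm{ind}_d(\mathcal{B}_i) + \gamma$,
and by Observation 3.10 then set $\mathcal{B}_{i+1}$ to be a syntactic refinement of $\mathcal{B}$ that induces the same
partition as $\mathcal{B}'_i$, and for which $|\mathcal{B}_{i+1}| \le h(|\mathcal{B}_i|) + |\mathcal{B}|$.

Noting that the index can only increase while moving to a refinement (by the Cauchy-Schwarz
inequality) and that it always lies in $[0, 1]$, this process must stop for some $j \le 1/\gamma$. $\mathcal{B}_j$ is the
required factor, and its size is bounded by $k_{\lfloor 1/\gamma \rfloor}$, where we define $k_0 = k = |\mathcal{B}|$ and by induction
$k_{i+1} = h(k_i) + k$. ■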
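
The size bound produced by this proof is just an iterated recurrence, which the following sketch evaluates. The function name `robust_size_bound` is hypothetical, `h` is assumed to be monotone non-decreasing as in the proof, and the number of rounds is taken to be $\lfloor 1/\gamma \rfloor$.

```python
import math


def robust_size_bound(k, h, gamma):
    """Iterate k_{i+1} = h(k_i) + k for floor(1/gamma) rounds, starting from k_0 = k."""
    rounds = math.floor(1 / gamma)
    size = k
    for _ in range(rounds):
        size = h(size) + k
    return size


# Example: a factor of complexity 2, growth function h(m) = m + 1, and gamma = 1/2.
bound_small = robust_size_bound(2, lambda m: m + 1, 0.5)
# Example: complexity 1, doubling growth function, and gamma = 1/4.
bound_doubling = robust_size_bound(1, lambda m: 2 * m, 0.25)
```

Even for such mild growth functions the bound grows quickly as $\gamma$ shrinks, which is why the later lemmas track these functions symbolically rather than numerically.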

Note: From now on we assume that all our relevant functions are monotone in their
corresponding variables, also when this is not stated explicitly. For example, a function $h$ fed to
Observation 3.13 will be assumed to be monotone non-decreasing, and if $k \le k'$, $\gamma \ge \gamma'$, and
$h(m) \le h'(m)$ for every $m \in \mathbb{N}$ (while both $h$ and $h'$ are monotone non-decreasing), then
$T_{3.13}(k, h, \gamma) \le T_{3.13}(k', h', \gamma')$. All our lemmas can indeed be made to provide such functions.

3.4 Robustness with Rank 

The next item on the agenda is to show that polynomial factors can be refined to ones of high 
rank. The following index definition is used for analyzing rank. 

Definition 3.14 The degree index of a factor $\mathcal{B}$ is the (infinite but almost everywhere zero)
sequence of non-negative integers $\mathrm{ind}_m(\mathcal{B}) = I = (i_1, i_2, \ldots)$, where $i_k$ is the number of
polynomials of degree $k$ in the sequence of polynomials defining $\mathcal{B}$.

Denote the set of all possible degree sequences as above by $\mathcal{I}$. Over $\mathcal{I}$ we define the
anti-lexicographic order, where $I < I'$ if $i_k < i'_k$ for the largest $k$ on which those coordinates differ.
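
The anti-lexicographic order can be sketched in a few lines. Here a degree index is assumed to be stored as a finite tuple `(i_1, ..., i_d)` with the trailing zeros omitted; the helper name `antilex_less` is illustrative.

```python
def antilex_less(a, b):
    """True if degree index a is strictly below b in the anti-lexicographic order."""
    length = max(len(a), len(b))
    # Pad with zeros so both indices have the same number of coordinates.
    pa = list(a) + [0] * (length - len(a))
    pb = list(b) + [0] * (length - len(b))
    # Compare from the highest degree downwards: the largest differing k decides.
    return pa[::-1] < pb[::-1]


# One quadratic outranks any number of linear polynomials.
many_linear_below_one_quadratic = antilex_less((100, 0), (0, 1))
```

The example shows why the order is not isomorphic to $\mathbb{N}$: the index $(0, 1)$ already has infinitely many indices $(m, 0)$ below it.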

The set $\mathcal{I}$ defined above is well-ordered in the sense that there exist no infinite strictly decreasing
sequences of members of $\mathcal{I}$, but this still does not provide for "standard" induction, as the order
is not isomorphic to $\mathbb{N}$. To replace induction we define the notion of a decrement.

Definition 3.15 Let $\mathcal{I}$ denote the well-ordered set of all possible degree indexes. A function
$\kappa : \mathbb{N} \times \mathcal{I} \to \mathcal{I}$ is called a decrement if for all $A \in \mathcal{I}$ and $n \in \mathbb{N}$ it satisfies $\kappa(n, A) < A$, for all $n$
and $A \le B$ it satisfies $\kappa(n, A) \le \kappa(n, B)$, and for all $n \le m$ and $A$ it satisfies $\kappa(n, A) \le \kappa(m, A)$.
The inequalities are with respect to the anti-lexicographic ordering of $\mathcal{I}$.

The following shows how, when we are given a decrement that "bounds" some process, we can 
use it to bound an iterative process. 

Lemma 3.16 There exist $T_{3.16}(k, d, h, \kappa)$ and $m_{3.16}(k, d, h, \kappa)$ that take numbers $k$ and $d$,
a monotone $h : \mathbb{N} \to \mathbb{N}$ and a decrement $\kappa$, and satisfy the following. If $\mathcal{B}_0, \mathcal{B}_1, \ldots, \mathcal{B}_m$
is a sequence of factors of bounded degree $d$ for which $|\mathcal{B}_0| \le k$, $|\mathcal{B}_i| \le h(|\mathcal{B}_{i-1}|)$ and
$\mathrm{ind}_m(\mathcal{B}_i) \le \kappa(|\mathcal{B}_{i-1}|, \mathrm{ind}_m(\mathcal{B}_{i-1}))$, then $|\mathcal{B}_m|$ is bounded by $T_{3.16}(k, d, h, \kappa)$ and $m$ is bounded
by $m_{3.16}(k, d, h, \kappa)$.

Proof: Let $I_0$ be the maximal (with respect to order) degree index of any degree $d$ factor of
complexity $k$, which is the sequence $(i_1, i_2, \ldots)$ for which $i_d = k$ and $i_j = 0$ for any $j \ne d$, and
let $h_0 = k$. Inductively define $h_i \in \mathbb{N}$ as $h_i = h(h_{i-1})$, and $I_i = \kappa(h_{i-1}, I_{i-1})$. Because $I_0, I_1, \ldots$
is a decreasing sequence over a well-ordered set, it must be of bounded length, which we denote
as $m_{3.16}(k, d, h, \kappa)$. We then set $T_{3.16}(k, d, h, \kappa) = h_{m_{3.16}(k, d, h, \kappa)}$. For a sequence of factors as
above, the monotonicity conditions of $\kappa$ ensure that $|\mathcal{B}_i| \le h_i$ and $\mathrm{ind}_m(\mathcal{B}_i) \le I_i$, and so we are
done. ■

The following provides a decrement that will bound the process of obtaining a high rank refine- 
ment of a given factor, as well as a bound on the size increment. Note that also if the required 
rank depends on the factor size, we can still get a bounding decrement. 

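Lemma 3.17 For every $r : \mathbb{N} \to \mathbb{N}$ there exist $h^{(r)}_{3.17} : \mathbb{N} \to \mathbb{N}$ and a decrement $\kappa^{(r)}_{3.17} : \mathbb{N} \times \mathcal{I} \to \mathcal{I}$,
satisfying the following for every $d$. If $\mathcal{B}$ is a factor of degree at most $d$ whose rank is less
than $r(|\mathcal{B}|)$, then there exists a semantic refinement $\mathcal{B}'$ of $\mathcal{B}$ for which $|\mathcal{B}'| \le h^{(r)}_{3.17}(|\mathcal{B}|)$ and
$\mathrm{ind}_m(\mathcal{B}') \le \kappa^{(r)}_{3.17}(|\mathcal{B}|, \mathrm{ind}_m(\mathcal{B}))$.

Moreover, if $\mathcal{B}$ is in itself a syntactic refinement of some $\hat{\mathcal{B}}$ that is of rank at least $r(|\mathcal{B}|) + 1$,
then additionally $\mathcal{B}'$ will be a syntactic refinement of $\hat{\mathcal{B}}$.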

Proof: We will deal with the first case, and then show how to modify the proof for the case
where being a syntactic refinement of some $\hat{\mathcal{B}}$ of the appropriate rank must be preserved.

Let $p_1, \ldots, p_C$ be the defining polynomials for $\mathcal{B}$, where $C = |\mathcal{B}|$. Suppose there is a linear
combination over $\mathbb{F}$ that shows that $\mathcal{B}$ has a rank smaller than $r(C)$. This means that for some
$(\alpha_1, \ldots, \alpha_C) \in \mathbb{F}^C \setminus \{0^C\}$, some arbitrary function $\Gamma : \mathbb{F}^\ell \to \mathbb{F}$ and polynomials $q_1, \ldots, q_\ell$ we
have $\sum_{j=1}^{C} \alpha_j p_j(x) = \Gamma(q_1(x), \ldots, q_\ell(x))$ for every $x \in \mathbb{F}^n$, where $\ell < r(C)$ and every $q_i$ is of
degree smaller than $\max\{\deg(p_j) \mid \alpha_j \ne 0\}$ (a possible special case is where $\ell = 0$ and $\Gamma$ is a
constant).

We select $j_0$ so that $\alpha_{j_0} \ne 0$ and $\deg(p_{j_0}) = \max\{\deg(p_j) \mid \alpha_j \ne 0\}$, and construct $\mathcal{B}'$ by
replacing $p_{j_0}$ with $q_1, \ldots, q_\ell$. This is clearly a semantic refinement of $\mathcal{B}$ of complexity bounded
by $h(C) = C + r(C) - 1$. Also, if $I = (i_1, \ldots)$ was the degree index of $\mathcal{B}$, then the degree index
of $\mathcal{B}'$ is bounded above by the following $\kappa(C, I) = (j_1, \ldots)$: Letting $k$ be the smallest number
such that $i_k > 0$, we set $j_k = i_k - 1$, and if $k > 1$ then we set $j_{k-1} = i_{k-1} + r(C) - 1$; all other
coordinates of $\kappa(C, I)$ are set equal to the respective coordinates of $I$.

The above argument provides us with $h^{(r)}_{3.17}$ and $\kappa^{(r)}_{3.17}$ as required.

Now we deal with an existing $\hat{\mathcal{B}}$ as above. We follow the same argument, but argue that we
can find $j_0$ for which $\alpha_{j_0} \ne 0$ that corresponds to a maximal degree polynomial, satisfying
additionally $j_0 > \hat{C} = |\hat{\mathcal{B}}|$. Assuming otherwise, we would find a counterexample to the rank
assumption on $\hat{\mathcal{B}}$: We would get that $\sum_{j=1}^{\hat{C}} \alpha_j p_j(x)$ can be expressed as a function of $q_1, \ldots, q_\ell$
and $q_{\ell+1} = \sum_{j=\hat{C}+1}^{C} \alpha_j p_j$, which would all be of lower degree than $\max\{\deg(p_j) \mid 1 \le j \le \hat{C}, \alpha_j \ne 0\}
= \max\{\deg(p_j) \mid \alpha_j \ne 0\}$, and would hence violate the rank of $\hat{\mathcal{B}}$. ■
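
The decrement built in this proof can be written out explicitly. The sketch below assumes a degree index is stored as a tuple `(i_1, ..., i_d)` and that the rank requirement `r` is passed as a Python callable; the names `decrement` and `below` are hypothetical.

```python
def decrement(C, index, r):
    """kappa(C, I) as described in the proof: remove one polynomial of the lowest
    present degree k and, if k > 1, add r(C) - 1 polynomials of degree k - 1.
    Assumes the index is non-zero, i.e. the factor has at least one polynomial."""
    new = list(index)
    k = next(deg for deg, count in enumerate(new, start=1) if count > 0)
    new[k - 1] -= 1
    if k > 1:
        new[k - 2] += r(C) - 1
    return tuple(new)


def below(a, b):
    """Strict anti-lexicographic comparison for equal-length indices."""
    return a[::-1] < b[::-1]


rank_requirement = lambda m: m + 2  # an arbitrary example of r
start = (0, 0, 3)                   # three cubic polynomials
step = decrement(3, start, rank_requirement)

# Iterating the decrement must terminate, since the order is a well-order.
current, steps = (0, 2), 0
while any(current):
    current = decrement(sum(current), current, rank_requirement)
    steps += 1
```

Each application trades one polynomial for possibly many polynomials of lower degree, so the complexity may grow while the degree index strictly decreases; this is the mechanism that Lemma 3.16 turns into a bound.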

Now we can combine the above two lemmas and prove the existence of high rank refinements. 

Lemma 3.18 There exists $D^{(r,d)}_{3.18}(k)$ which takes two numbers $k$ and $d$ and a monotone function
$r : \mathbb{N} \to \mathbb{N}$, and satisfies the following. For every factor $\mathcal{B}$ of bounded degree $d$ there is a semantic
refinement $\mathcal{B}'$ for which $|\mathcal{B}'| \le D^{(r,d)}_{3.18}(|\mathcal{B}|)$, which is of bounded degree $d$ and has rank at least $r(|\mathcal{B}'|)$.

Moreover, if $\mathcal{B}$ is in itself a syntactic refinement of some $\hat{\mathcal{B}}$ that is of rank at least
$r(D^{(r,d)}_{3.18}(|\mathcal{B}|)) + 1$, then additionally $\mathcal{B}'$ will be a syntactic refinement of $\hat{\mathcal{B}}$.
Proof: We set $D^{(r,d)}_{3.18}(k) = T_{3.16}(k, d, h^{(r)}_{3.17}, \kappa^{(r)}_{3.17})$. We set $\mathcal{B}_0 = \mathcal{B}$, and as long as $\mathcal{B}_i$ is of
rank less than $r(|\mathcal{B}_i|)$ we move to a semantic refinement $\mathcal{B}_{i+1}$ as guaranteed by Lemma 3.17.
By Lemma 3.16 (with $k = |\mathcal{B}|$) the sequence $\mathcal{B}_0, \mathcal{B}_1, \ldots$ has length bounded by
$m_{3.16}(k, d, h^{(r)}_{3.17}, \kappa^{(r)}_{3.17})$, and the final factor $\mathcal{B}_i$ is of rank at least $r(|\mathcal{B}_i|)$ (otherwise we could
have continued the sequence) and of complexity bounded by $T_{3.16}(k, d, h^{(r)}_{3.17}, \kappa^{(r)}_{3.17})$.

For the case of a prior factor $\hat{\mathcal{B}}$ we just use the corresponding case of Lemma 3.17. ■

Now we finally state the main technical lemma that we will use for our decompositions. It will
find a refinement that is both robust and of high rank, while not breaking a given syntactic
refinement relation to a high rank factor if one exists.

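Lemma 3.19 (main robustness lemma) For an appropriate function $T_{3.19}(k, h, d, r, \gamma)$, for
any $\mathcal{B}$ of degree bound $d$, monotone $h : \mathbb{N} \to \mathbb{N}$ and $r : \mathbb{N} \to \mathbb{N}$, and $\gamma \in (0, 1)$ there exists
a semantic refinement $\mathcal{B}'$ of $\mathcal{B}$ which is of rank at least $r(|\mathcal{B}'|)$ and $(h, \gamma)$-robust, for which
$|\mathcal{B}'| \le T_{3.19}(|\mathcal{B}|, h, d, r, \gamma)$.

Moreover, if $\mathcal{B}$ is in itself a syntactic refinement of some $\hat{\mathcal{B}}$ that is of rank at least
$r(T_{3.19}(|\mathcal{B}|, h, d, r, \gamma)) + 1$, then additionally $\mathcal{B}'$ will be a syntactic refinement of $\hat{\mathcal{B}}$ (this holds
also for the case where $\mathcal{B} = \hat{\mathcal{B}}$).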

Proof: We set $T_{3.19}(k, h, d, r, \gamma) = D^{(r,d)}_{3.18}\bigl(T_{3.13}(k, h \circ D^{(r,d)}_{3.18}, \gamma)\bigr)$. Given $\mathcal{B}$, we first use
Observation 3.13 to find $\mathcal{B}_1$ that is a syntactic refinement of $\mathcal{B}$ and is $(h \circ D^{(r,d)}_{3.18}, \gamma)$-robust. We
then let $\mathcal{B}'$ be its semantic refinement according to Lemma 3.18 that is of rank at least $r(|\mathcal{B}'|)$. The
complexity of $\mathcal{B}'$ is at most $D^{(r,d)}_{3.18}(|\mathcal{B}_1|)$, and hence (apart from being bounded by the above
$T_{3.19}(|\mathcal{B}|, h, d, r, \gamma)$) by Observation 3.12 it is $(h, \gamma)$-robust as required.

For the case where there is a prior factor $\hat{\mathcal{B}}$ of the stated rank, we just use the corresponding
case of Lemma 3.18. ■



4 Decomposition Theorems 

We use here the tools of the previous section to prove two decomposition theorems. First 
we state and prove the strong decomposition theorem (it is called "strong" on account of also 
guaranteeing high rank); similar theorems were proved in previous works, and we only make a 
seemingly small (yet crucial to what follows) addition that preserves a given syntactic refinement 
relation. Then we state and prove the super decomposition theorem, which uses the strong 
decomposition theorem (or more accurately the main lemma implying it) as a lemma. 

Super decomposition provides us with two successive factors, one being a syntactic refinement 
of the other. For the testing proofs, instead of using it directly, we will use a corollary that
"chooses" out of the finer factor only one representative for each of the cells of the coarser factor. This
is done in the subcell selection corollary. The resulting representatives will satisfy properties
that are stronger than what any one factor can satisfy by itself.
4.1 Strong Decomposition



First, a corollary of Theorem 3.2. 



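Lemma 4.1 For $d < p$, suppose that $\mathcal{B}$ is a polynomial factor of degree $d$ and complexity $C$,
and suppose $f : \mathbb{F}_p^n \to \{0, 1\}$ is such that $\|f - \mathbb{E}[f|\mathcal{B}]\|_{U^{d+1}} \ge \delta$. Then, there exists a refined
polynomial factor $\mathcal{B}'$ of degree $d$ and complexity at most $C + 1$ such that:

\[ \|\mathbb{E}[f|\mathcal{B}']\|_2^2 \ge \|\mathbb{E}[f|\mathcal{B}]\|_2^2 + (\epsilon_{3.2}(\delta, p))^2 \]

where $\epsilon_{3.2}$ is the function in Theorem 3.2.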



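Proof: $g = f - \mathbb{E}[f|\mathcal{B}]$ is bounded to $[-1, 1]$. So, applying Theorem 3.2 yields a degree-$d$
polynomial $P$ satisfying $|\mathbb{E}[g(x) \cdot e(P(x))]| \ge \epsilon_{3.2}(\delta, p)$. The polynomial $P$ by itself generates a
factor of complexity 1. Define $\mathcal{B}'$ to be the common refinement of $\mathcal{B}$ and this factor (by adding $P$ to the
polynomials defining $\mathcal{B}$); its complexity is $C + 1$.

Observe that:

\begin{align*}
\|\mathbb{E}[g|\mathcal{B}']\|_1 = \mathbb{E}\bigl[|\mathbb{E}[g|\mathcal{B}'](x)|\bigr]
 &= \mathbb{E}\bigl[|\mathbb{E}[g|\mathcal{B}'](x) \cdot e(P(x))|\bigr] \\
 &\ge \bigl|\mathbb{E}\bigl[\mathbb{E}[g|\mathcal{B}'](x) \cdot e(P(x))\bigr]\bigr| = \bigl|\mathbb{E}[g(x) \cdot e(P(x))]\bigr| \ge \epsilon_{3.2}(\delta, p)
\end{align*}

where the second equality is simply due to $|e(P(x))| = 1$, and the third equality uses the fact
that $P$ is constant on each atom of $\mathcal{B}'$. Now finally:

\[ \|\mathbb{E}[f|\mathcal{B}']\|_2^2 - \|\mathbb{E}[f|\mathcal{B}]\|_2^2 = \|\mathbb{E}[f|\mathcal{B}'] - \mathbb{E}[f|\mathcal{B}]\|_2^2 = \|\mathbb{E}[g|\mathcal{B}']\|_2^2 \ge \|\mathbb{E}[g|\mathcal{B}']\|_1^2 \ge \epsilon_{3.2}^2(\delta, p) \]

where the first equality uses the fact that $\mathcal{B}'$ is a refinement of $\mathcal{B}$. ■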



The contrapositive of the above provides us with a function decomposition given a sufficiently
robust polynomial factor.

Lemma 4.2 For any $\eta$ and $d < p$ there exist $h_{4.2} : \mathbb{N} \to \mathbb{N}$ and $\gamma_{4.2}(\eta, p)$, so that if $\mathcal{B}$ is
$(h_{4.2}, \gamma_{4.2}(\eta, p))$-robust (with respect to $f$) among factors of degree bound $d$ over $\mathbb{F}_p$, then there
is a decomposition $f = f_1 + f_2$ where $f_1$ is constant over every atom of $\mathcal{B}$ and ranges in $[0, 1]$,
and $f_2$ satisfies $\|f_2\|_{U^{d+1}} \le \eta$ and ranges in $[-1, 1]$.

Proof: We set simply $h_{4.2}(k) = k + 1$ and $\gamma_{4.2}(\eta, p) = \epsilon_{3.2}(\eta, p)^2$. Given $\mathcal{B}$ as above we
set $f_1 = \mathbb{E}[f|\mathcal{B}]$ and $f_2 = f - \mathbb{E}[f|\mathcal{B}]$. These functions clearly have the required ranges. The
robustness condition of $\mathcal{B}$ implies the contrapositive of the conclusion of Lemma 4.1, and so we
must have $\|f_2\|_{U^{d+1}} < \eta$ as required. ■

However, we would like to make the Gowers norm bound also a function of $|\mathcal{B}|$. For this we will
decompose $f$ into three functions, where the third "error term" function has a bound on its $\ell_2$
norm. In fact an $\ell_2$ norm bound is what we need for an error term, but to reach even a constant
$\ell_2$ norm bound we cannot avoid having also the function that has "only" a Gowers norm bound.
Lemma 4.3 For any $d < p$, $\delta$ and $\eta : \mathbb{N} \to \mathbb{R}^+$ there exist $h^{(\eta)}_{4.3} : \mathbb{N} \to \mathbb{N}$ and $\gamma_{4.3}(\delta)$, so that if
$\mathcal{B}$ is $(h^{(\eta)}_{4.3}, \gamma_{4.3}(\delta))$-robust (with respect to $f$) among factors of degree bound $d$, then there is a
decomposition $f = f_1 + f_2 + f_3$ where $f_1$ is constant over every atom of $\mathcal{B}$ and ranges in $[0, 1]$, $f_2$
satisfies $\|f_2\|_{U^{d+1}} \le \eta(|\mathcal{B}|)$ and ranges in $[-1, 1]$, and $f_3$ ranges in $[-1, 1]$ and satisfies $\|f_3\|_2 \le \delta$,
where $f_1 + f_3$ also ranges in $[0, 1]$.

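Proof: We set $h^{(\eta)}_{4.3}(m) = T_{3.13}(m, h_{4.2}, \gamma_{4.2}(\eta(m), p))$ for every $m \in \mathbb{N}$ and $\gamma_{4.3}(\delta) = \delta^2$.
Given $\mathcal{B}$ satisfying the robustness condition above, we let $\mathcal{B}'$ be its syntactic refinement which
is $(h_{4.2}, \gamma_{4.2}(\eta(|\mathcal{B}|), p))$-robust and for which $|\mathcal{B}'| \le T_{3.13}(|\mathcal{B}|, h_{4.2}, \gamma_{4.2}(\eta(|\mathcal{B}|), p))$. We let $f_1 =
\mathbb{E}[f|\mathcal{B}]$, and $f_2 = f - \mathbb{E}[f|\mathcal{B}']$. As per Lemma 4.2 $f_2$ satisfies the required Gowers norm condition.
This leaves us with $f_3 = \mathbb{E}[f|\mathcal{B}'] - \mathbb{E}[f|\mathcal{B}]$. The required $\ell_2$ condition on this function follows
directly from $\mathcal{B}'$ not violating the robustness condition of $\mathcal{B}$, since $\|f_3\|_2^2 = \mathrm{ind}_d(\mathcal{B}') - \mathrm{ind}_d(\mathcal{B}) < \gamma_{4.3}(\delta) = \delta^2$. ■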
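
The three-term decomposition in this proof can be checked numerically on a toy instance. The sketch below assumes factors over $\mathbb{F}_3^3$ given as lists of Python callables, and picks an arbitrary refinement rather than a robust one, so it only illustrates the algebraic identities (the sum, the ranges, and $\|f_3\|_2^2$ equalling the gain in density index), not the Gowers norm bound. All names are illustrative.

```python
import itertools
import random
from collections import defaultdict

p, n = 3, 3
points = list(itertools.product(range(p), repeat=n))


def cond_exp(f, factor):
    """E[f|B] as a dict mapping each point to the average of f over its cell."""
    cells = defaultdict(list)
    for x in points:
        cells[tuple(P(x) % p for P in factor)].append(x)
    out = {}
    for members in cells.values():
        avg = sum(f[x] for x in members) / len(members)
        for x in members:
            out[x] = avg
    return out


def mean_square(g):
    """Squared l_2 norm with respect to the uniform distribution."""
    return sum(v * v for v in g.values()) / len(points)


rng = random.Random(1)  # fixed seed so the example is reproducible
f = {x: rng.randint(0, 1) for x in points}

B = [lambda x: x[0]]
B_fine = B + [lambda x: x[1] + x[2] * x[2]]  # syntactic refinement of B

e_coarse, e_fine = cond_exp(f, B), cond_exp(f, B_fine)
f1 = e_coarse                                        # constant on cells of B
f2 = {x: f[x] - e_fine[x] for x in points}           # the "Gowers-small" part
f3 = {x: e_fine[x] - e_coarse[x] for x in points}    # the l_2 error term

energy_gain = mean_square(e_fine) - mean_square(e_coarse)
```

The identity `mean_square(f3) == energy_gain` (up to rounding) is exactly the Pythagorean step that turns robustness of $\mathcal{B}$ into the $\ell_2$ bound on $f_3$.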

We now have all the tools to quickly wrap up the proof of the existence of a strong decomposition. 

Theorem 4.4 (Strong Decomposition Theorem) Suppose $\delta > 0$ is a real and $C_0, d \ge 1$ are
integers so that $d < p$. Let $\eta : \mathbb{N} \to \mathbb{R}^+$ be an arbitrary non-increasing function and $r : \mathbb{N} \to \mathbb{N}$
be an arbitrary non-decreasing function. Then there exists $C = C_{4.4}(\delta, \eta, p, r, C_0)$ such that the
following holds.

Given $f : \mathbb{F}_p^n \to \{0, 1\}$ and a polynomial factor $\mathcal{B}_0$ of degree at most $d$ and complexity at most
$C_0$, there exist three functions $f_1, f_2, f_3 : \mathbb{F}_p^n \to \mathbb{R}$ and a polynomial factor $\mathcal{B} \preceq_{sem} \mathcal{B}_0$ of degree
at most $d$ and complexity at most $C$ such that the following hold:

• $f = f_1 + f_2 + f_3$

• $f_1 = \mathbb{E}[f|\mathcal{B}]$

• $\|f_2\|_{U^{d+1}} \le \eta(|\mathcal{B}|)$

• $\|f_3\|_2 \le \delta$

• $f_1$ and $f_1 + f_3$ have range $[0, 1]$; $f_2$ and $f_3$ have range $[-1, 1]$.

• $\mathcal{B}$ is of rank at least $r(|\mathcal{B}|)$

Moreover, if $\mathcal{B}_0$ is a syntactic refinement of some $\hat{\mathcal{B}}$ of rank at least $r(C) + 1$, then $\mathcal{B}$ will also
be a syntactic refinement of $\hat{\mathcal{B}}$ (in particular this also holds if $\mathcal{B}_0 = \hat{\mathcal{B}}$).

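Proof: Set $C_{4.4}(\delta, \eta, p, r, C_0) = T_{3.19}(C_0, h^{(\eta)}_{4.3}, p, r, \gamma_{4.3}(\delta)) \ge T_{3.19}(C_0, h^{(\eta)}_{4.3}, d, r, \gamma_{4.3}(\delta))$.
Given $\mathcal{B}_0$ and $f$, we set $\mathcal{B}$ to be the $(h^{(\eta)}_{4.3}, \gamma_{4.3}(\delta))$-robust refinement of $\mathcal{B}_0$ guaranteed by
Lemma 3.19. Lemma 4.3 guarantees the required decomposition $f = f_1 + f_2 + f_3$, and the case
of a prior $\hat{\mathcal{B}}$ is also handled seamlessly by Lemma 3.19. ■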
4.2 Super Decomposition and Subcell Selection

What we would really like is that in some sense the $\delta$ of Theorem 4.4 would also be able to
depend on $|\mathcal{B}|$, but this is clearly impossible. So instead, taking some inspiration from [AFKS00],
we will strive to have a sequence of two factors $\mathcal{B}$ and $\mathcal{B}'$, the latter a syntactic refinement of
the former, so that the $\delta$ of $\mathcal{B}'$ would be a function of $|\mathcal{B}|$. However, for this to mean anything
we also need $\mathcal{B}'$ to "faithfully" represent $\mathcal{B}$, in the sense that we define now.

Definition 4.5 (Polynomial factor represents another factor) Given a function $f :
\mathbb{F}_p^n \to \{0, 1\}$, a polynomial factor $\mathcal{B}'$ that syntactically refines another factor $\mathcal{B}$ and a real
$\zeta \in (0, 1)$, we say $\mathcal{B}'$ $\zeta$-represents $\mathcal{B}$ with respect to $f$ if for at most a $\zeta$ fraction of cells $c$
of $\mathcal{B}$, more than a $\zeta$ fraction of the cells $c'$ lying inside $c$ satisfy $|\mathbb{E}[f|c] - \mathbb{E}[f|c']| > \zeta$.
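
Definition 4.5 translates directly into a counting check once the cell averages are known. The sketch below assumes the averages are supplied as plain dictionaries and that fractions are measured by counting cells; the name `represents` and this data layout are illustrative choices, not the paper's.

```python
def represents(coarse_avg, fine_avg, zeta):
    """Check zeta-representation by counting.

    coarse_avg: dict mapping each cell c of B to E[f|c].
    fine_avg:   dict mapping each cell c of B to the list of E[f|c'] over
                the cells c' of B' lying inside c.
    """
    bad_cells = 0
    for c, avg in coarse_avg.items():
        subcells = fine_avg[c]
        deviating = sum(1 for sub in subcells if abs(avg - sub) > zeta)
        # c is "bad" if more than a zeta fraction of its subcells deviate.
        if deviating > zeta * len(subcells):
            bad_cells += 1
    # At most a zeta fraction of the cells of B may be bad.
    return bad_cells <= zeta * len(coarse_avg)


coarse = {0: 0.5, 1: 0.5}
fine = {0: [0.5, 0.5, 0.5, 0.5], 1: [0.0, 1.0, 0.5, 0.5]}

strict = represents(coarse, fine, 0.25)   # cell 1 is bad, and 1 > 0.25 * 2
lenient = represents(coarse, fine, 0.6)   # no subcell deviates by more than 0.6
```

Note that the same parameter $\zeta$ controls three things at once: the allowed fraction of bad cells, the allowed fraction of deviating subcells, and the size of a deviation.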

To be able to infer that a refinement is representing, we will use the following well-known defect 
version of the Cauchy-Schwarz inequality: 

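Observation 4.6 If $\sum_{i \in I} \alpha_i = 1$ where $\alpha_i$ are all non-negative, $f : I \to \mathbb{R}$ ranges over $[0, 1]$,
and for some $J \subseteq I$ we have $\bigl(\sum_{i \in J} \alpha_i f(i)\bigr) / \bigl(\sum_{i \in J} \alpha_i\bigr) = \sum_{i \in I} \alpha_i f(i) + \eta$ where $\eta \in [-1, 1]$,
then $\sum_{i \in I} \alpha_i (f(i))^2 \ge \bigl(\sum_{i \in I} \alpha_i f(i)\bigr)^2 + \bigl(\sum_{i \in J} \alpha_i\bigr) \eta^2$.

Proof: For ease of notation denote the average $a = \sum_{i \in I} \alpha_i f(i)$ and $\zeta = \sum_{j \in J} \alpha_j$.
By the standard Cauchy-Schwarz inequality $\sum_{i \in I} \alpha_i (f(i))^2 \ge \sum_{i \in I} \alpha_i (f'(i))^2$, where $f'(i) =
\bigl(\sum_{j \in J} \alpha_j f(j)\bigr) / \bigl(\sum_{j \in J} \alpha_j\bigr) = a + \eta$ if $i \in J$ and $f'(i) = \bigl(\sum_{j \in I \setminus J} \alpha_j f(j)\bigr) / \bigl(\sum_{j \in I \setminus J} \alpha_j\bigr) =
a - \zeta\eta/(1 - \zeta)$ if $i \notin J$. The sum over $f'$ now equals $\zeta(a + \eta)^2 + (1 - \zeta)(a - \zeta\eta/(1 - \zeta))^2 \ge a^2 + \zeta\eta^2$. ■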
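
For readers who want the last inequality spelled out, here is the expansion, writing $a$ for the overall weighted average, $\eta$ for the defect on $J$, and $\zeta$ for the total weight of $J$, and assuming $0 < \zeta < 1$:

```latex
\begin{align*}
\zeta (a + \eta)^2 + (1 - \zeta)\Bigl(a - \frac{\zeta\eta}{1 - \zeta}\Bigr)^2
 &= \zeta a^2 + 2\zeta a \eta + \zeta \eta^2
    + (1 - \zeta) a^2 - 2 \zeta a \eta + \frac{\zeta^2 \eta^2}{1 - \zeta} \\
 &= a^2 + \zeta \eta^2 + \frac{\zeta^2 \eta^2}{1 - \zeta}
 \;\ge\; a^2 + \zeta \eta^2 .
\end{align*}
```

The cross terms cancel, and the discarded term $\zeta^2\eta^2/(1-\zeta)$ is non-negative, so the bound loses only a lower-order contribution.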



We can now show that, under some rank assumptions, a non-representing refinement is evidence
of a factor being non-robust.

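Lemma 4.7 There are functions $r_{4.7}(p, m)$ and $\gamma_{4.7}(\zeta)$ for which the following holds. For $f :
\mathbb{F}_p^n \to \{0, 1\}$, if $\mathcal{B}'$ is a factor of rank at least $r_{4.7}(p, |\mathcal{B}'|)$, and is a syntactic refinement of a factor $\mathcal{B}$
of rank at least $r_{4.7}(p, |\mathcal{B}|)$, both of degree $d < p$, and $\mathcal{B}'$ does not $\zeta$-represent $\mathcal{B}$ with respect to $f$, then
$\mathrm{ind}_d(\mathcal{B}') \ge \mathrm{ind}_d(\mathcal{B}) + \gamma_{4.7}(\zeta)$.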

Proof: We first set $r_{4.7}(p, m) = r_{3.7}(p, 1/(2p^m))$. If $\mathcal{B}'$ does not $\zeta$-represent $\mathcal{B}$, then it must be
the case that there are at least $\zeta p^{|\mathcal{B}|}/2$ cells of $\mathcal{B}$, so that for every cell $c$ of them, there are at
least $\zeta p^{|\mathcal{B}'| - |\mathcal{B}|}/2$ cells $c'$ of $\mathcal{B}'$ lying inside of it, so that $|\mathbb{E}[f|c] - \mathbb{E}[f|c']| > \zeta$.

Let us concentrate for now on one such cell $c$ of $\mathcal{B}$. Either there are at least $\zeta p^{|\mathcal{B}'| - |\mathcal{B}|}/4$ cells
$c'$ inside $c$ so that $\mathbb{E}[f|c'] - \mathbb{E}[f|c] > \zeta$, or there are at least $\zeta p^{|\mathcal{B}'| - |\mathcal{B}|}/4$ such cells so that
$\mathbb{E}[f|c'] - \mathbb{E}[f|c] < -\zeta$. We will assume the first case, as the treatment of the second case is
virtually identical and provides the same lower bound for the cell.

Now we refer to Observation 4.6, where $I$ is identified with $\mathbb{F}_p^{|\mathcal{B}'| - |\mathcal{B}|}$, the set of cells of $\mathcal{B}'$ lying
in $c$, and $J$ is identified with the set of those cells $c'$ satisfying $\mathbb{E}[f|c'] - \mathbb{E}[f|c] > \zeta$. The value of
each $\alpha_i$ can easily be shown to be at least $p^{|\mathcal{B}| - |\mathcal{B}'|}/3$, by comparing the minimum possible size
of the cell $c'$ with the maximum possible size of the cell $c$. Inserting the other corresponding
values in Observation 4.6, we obtain $\mathbb{E}_{x \in c}[(\mathbb{E}[f|\mathcal{B}'](x))^2] \ge (\mathbb{E}[f|c])^2 + \zeta^3/12$.

Summing up the above contribution for all cells $c$ of $\mathcal{B}$, and noting that the relative size of
every cell of $\mathcal{B}$ is at least $p^{-|\mathcal{B}|}/2$ by Lemma 3.8, we obtain that $\mathrm{ind}_d(\mathcal{B}') = \mathbb{E}[(\mathbb{E}[f|\mathcal{B}'](x))^2] \ge
\mathbb{E}[(\mathbb{E}[f|\mathcal{B}](x))^2] + \zeta^4/24 = \mathrm{ind}_d(\mathcal{B}) + \gamma_{4.7}(\zeta)$, where we set $\gamma_{4.7}(\zeta) = \zeta^4/24$. ■

The following technical lemma shows that if the partition is robust enough, then it has a 
specified robust and representing syntactic refinement, where we also take a rank requirement 
into account. 

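Lemma 4.8 For every $h : \mathbb{N} \to \mathbb{N}$, $\gamma : \mathbb{N} \to (0, 1)$, $r : \mathbb{N} \to \mathbb{N}$, $p \in \mathbb{N}$ and $\zeta \in (0, 1)$ there are
$H^{(h,\gamma,p,r)}_{4.8} : \mathbb{N} \to \mathbb{N}$, $R^{(h,\gamma,p,r)}_{4.8} : \mathbb{N} \to \mathbb{N}$ and $\Gamma_{4.8}(\zeta) \in (0, 1)$ satisfying the following among factors
of degree bound $d < p$ over $\mathbb{F}_p$. If $\mathcal{B}$ is an $(H^{(h,\gamma,p,r)}_{4.8}, \Gamma_{4.8}(\zeta))$-robust partition of rank at least
$R^{(h,\gamma,p,r)}_{4.8}(|\mathcal{B}|)$, then it has a $\zeta$-representing syntactic refinement $\mathcal{B}'$ which is $(h, \gamma(|\mathcal{B}|))$-robust
and is of rank at least $r(|\mathcal{B}'|)$, which satisfies also $|\mathcal{B}'| \le S_{4.8}(|\mathcal{B}|, h, p, r, \gamma)$ for the appropriate
function $S_{4.8}(m, h, p, r, \gamma)$.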

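Proof: Set the following in order:

\begin{align*}
S_{4.8}(m, h, p, r, \gamma) &= T_{3.19}(m, h, p, r, \gamma(m)) \\
H^{(h,\gamma,p,r)}_{4.8}(m) &= S_{4.8}(m, h, p, r, \gamma) \\
R^{(h,\gamma,p,r)}_{4.8}(m) &= \max\{r(S_{4.8}(m, h, p, r, \gamma)) + 1,\; r_{4.7}(p, m)\} \\
\Gamma_{4.8}(\zeta) &= \gamma_{4.7}(\zeta)
\end{align*}

Assuming that $\mathcal{B}$ satisfies the requisites, we use Lemma 3.19 to find a refinement $\mathcal{B}'$ that
is $(h, \gamma(|\mathcal{B}|))$-robust, of rank at least $r(|\mathcal{B}'|)$, and satisfying $|\mathcal{B}'| \le T_{3.19}(|\mathcal{B}|, h, d, r, \gamma(|\mathcal{B}|)) \le
T_{3.19}(|\mathcal{B}|, h, p, r, \gamma(|\mathcal{B}|))$, the required complexity bound (note that Lemma 3.19 is fed the
number $\gamma(|\mathcal{B}|)$, not the function $\gamma$).

The condition that $\mathcal{B}$ is $(H^{(h,\gamma,p,r)}_{4.8}, \Gamma_{4.8}(\zeta))$-semantically-robust means that $\mathrm{ind}_d(\mathcal{B}') < \mathrm{ind}_d(\mathcal{B}) +
\gamma_{4.7}(\zeta)$, and so $\mathcal{B}'$ is $\zeta$-representing for $\mathcal{B}$ by Lemma 4.7 (as the partitions also satisfy the
corresponding rank requirement).

The condition that $\mathcal{B}$ is of rank at least $R^{(h,\gamma,p,r)}_{4.8}(|\mathcal{B}|) \ge r(T_{3.19}(|\mathcal{B}|, h, p, r, \gamma(|\mathcal{B}|))) + 1$ means
that (setting $\hat{\mathcal{B}} = \mathcal{B}$) Lemma 3.19 provides the additional requirement that $\mathcal{B}'$ is a syntactic
refinement of $\mathcal{B}$. ■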

We can now put forth our final decomposition theorem. 

Theorem 4.9 (Super Decomposition Theorem) Suppose $\zeta > 0$ and $d, C_0 \ge 1$ are integers 
so that $d < p$. Let $\eta : \mathbb{N} \to \mathbb{R}^+$ and $\delta : \mathbb{N} \to \mathbb{R}^+$ be arbitrary non-increasing functions, and 
$r : \mathbb{N} \to \mathbb{N}$ be an arbitrary non-decreasing function. Then there exists $C = C_{4.9}(\delta,\eta,p,r,\zeta,C_0)$ 
such that the following holds. 

Given $f : \mathbb{F}_p^n \to \{0,1\}$ and a polynomial factor $\mathcal{B}_0$ of degree at most $d$ and complexity at most 
$C_0$, there exist functions $f_1, f_2, f_3 : \mathbb{F}_p^n \to \mathbb{R}$, a semantic refinement $\mathcal{B}$ of $\mathcal{B}_0$ of degree at most $d$ 
and a syntactic refinement $\mathcal{B}'$ of $\mathcal{B}$ of degree at most $d$ and of complexity at most $C$, such that 
the following hold: 

• $f = f_1 + f_2 + f_3$ 

• $f_1 = \mathbb{E}[f \mid \mathcal{B}']$ 

• $\|f_2\|_{U^{d+1}} \le \eta(|\mathcal{B}'|)$ 

• $\|f_3\|_2 \le \delta(|\mathcal{B}|)$ 

• $f_1$ and $f_1 + f_3$ have range $[0,1]$; $f_2$ and $f_3$ have range $[-1,1]$. 

• $\mathcal{B}$ is of rank at least $r(|\mathcal{B}|)$. 

• $\mathcal{B}'$ is of rank at least $r(|\mathcal{B}'|)$. 

• $\mathcal{B}'$ $\zeta$-represents $\mathcal{B}$ with respect to $f$. 

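Proof: Let the function $\gamma$ be defined by $\gamma(m) = \gamma_{4.3}(\delta(m))$ and let $h$ be defined by 
$h(m) = h_{4.3}^{(\eta)}(m)$. Then set: 

$$C_{4.9}(\delta,\eta,p,r,\zeta,C_0) = S_{4.8}\Big(T_{3.19}\big(C_0,\, H_{4.8}^{(h,\gamma,p,r)},\, p,\, R_{4.8}^{(h,\gamma,p,r)},\, \tau_{4.8}(\zeta)\big),\, h,\, p,\, r,\, \gamma\Big).$$

Given $\mathcal{B}_0$, we set $\mathcal{B}$ to be the semantic refinement that is guaranteed by Lemma 3.19 that 
is $(H_{4.8}^{(h,\gamma,p,r)}, \tau_{4.8}(\zeta))$-robust and is of rank at least $R_{4.8}^{(h,\gamma,p,r)}(|\mathcal{B}|)$. $|\mathcal{B}|$ will be bounded by 
$T_{3.19}(C_0, H_{4.8}^{(h,\gamma,p,r)}, p, R_{4.8}^{(h,\gamma,p,r)}, \tau_{4.8}(\zeta))$. Note also that $R_{4.8}^{(h,\gamma,p,r)}(|\mathcal{B}|) > r(|\mathcal{B}|)$. 

Now we can use Lemma 4.8 to provide us a $\zeta$-representing syntactic refinement $\mathcal{B}'$ of $\mathcal{B}$, that 
is of rank at least $r(|\mathcal{B}'|)$, and is $(h, \gamma(|\mathcal{B}|))$-robust and thus $(h_{4.3}^{(\eta)}, \gamma_{4.3}(\delta(|\mathcal{B}|)))$-robust. The 
factor $\mathcal{B}'$ satisfies the required complexity upper bound by substituting the bound on $|\mathcal{B}|$ into 
the guaranteed complexity bound of Lemma 4.8. Finally Lemma 4.3 provides the required 
decomposition $f = f_1 + f_2 + f_3$ over $\mathcal{B}'$. ■ 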

One could envision future applications in which we would need the whole of B'. Here we will 
need a careful choice of one cell of B' for every cell of B. This selection will satisfy the following: 

• The choice of cells will be made in a "uniform" manner. This part is helped by B' being 
a syntactic refinement. We will in fact set the "subcell ID" (the values of the polynomials 
appearing in B' and not in B) to be the same for all cells of B. 

• All the subcells will feature a "good" decomposition, in terms of the norm of $f_3$. 

• Most subcells will "well-represent" their corresponding cells from B, in terms of the 
corresponding conditional expectation of $f$. 

Now we state this formally. 

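Corollary 4.10 (Subcell Selection) Suppose $\zeta > 0$ and $d \ge 1$ is an integer less than $p$. 
Let $\eta, \delta : \mathbb{N} \to \mathbb{R}^+$ be arbitrary non-increasing functions, and let $r : \mathbb{N} \to \mathbb{N}$ be an arbitrary 
non-decreasing function. Then, there exists $C = C_{4.10}(\delta,\eta,p,r,\zeta)$ such that the following holds. 

Given $f : \mathbb{F}_p^n \to \{0,1\}$, there exist functions $f_1, f_2, f_3 : \mathbb{F}_p^n \to \mathbb{R}$, a polynomial factor $\mathcal{B}$ with cells 
denoted by elements of $\mathbb{F}_p^{|\mathcal{B}|}$, a syntactic refinement $\mathcal{B}'$ of $\mathcal{B}$ with complexity at most $C$ and cells 
denoted by elements of $\mathbb{F}_p^{|\mathcal{B}|} \times \mathbb{F}_p^{|\mathcal{B}'|-|\mathcal{B}|}$, and an element $s \in \mathbb{F}_p^{|\mathcal{B}'|-|\mathcal{B}|}$ such that the following is 
true: 

• $f = f_1 + f_2 + f_3$ 

• $f_1 = \mathbb{E}[f \mid \mathcal{B}']$ 

• $\|f_2\|_{U^{d+1}} \le \eta(|\mathcal{B}'|)$ 

• $f_1$ and $f_1 + f_3$ have range $[0,1]$; $f_2$ and $f_3$ have range $[-1,1]$. 

• $\mathcal{B}$ is of rank at least $r(|\mathcal{B}|)$ 

• $\mathcal{B}'$ is of rank at least $r(|\mathcal{B}'|)$ 

• For every $c \in \mathbb{F}_p^{|\mathcal{B}|}$, the subcell $c' = (c,s) \in \mathbb{F}_p^{|\mathcal{B}'|}$ has the property that 
$\mathbb{E}_x[(f_3(x))^2 \mid \mathcal{B}'(x) = c'] \le \delta(|\mathcal{B}|)^2$. 

• $\Pr_{c \in \mathbb{F}_p^{|\mathcal{B}|}}\big[\,|\mathbb{E}[f|c] - \mathbb{E}[f|(c,s)]| > \zeta\,\big] < \zeta$, where we denote $\mathbb{E}[f|c] = \mathbb{E}[f(x) \mid \mathcal{B}(x) = c]$ and 
$\mathbb{E}[f|(c,s)] = \mathbb{E}[f(x) \mid \mathcal{B}'(x) = (c,s)]$. 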

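Proof: Let $r'(m) = r_{3.7}(p, p^{-m}/10)$, so that by Theorem 3.7, a polynomial factor $\mathcal{B}$ of degree 
$d$ and rank at least $r'(|\mathcal{B}|)$ satisfies for any $c \in \mathbb{F}_p^{|\mathcal{B}|}$ 

$$0.9\, p^{-|\mathcal{B}|} < \Pr_x[\mathcal{B}(x) = c] < 1.1\, p^{-|\mathcal{B}|}.$$

Set $C_{4.10}(\delta,\eta,p,r,\zeta) = C_{4.9}(\Delta,\eta,p,r'',\zeta/4,1)$, where $\Delta(m) = 0.1 \cdot \delta(m)/p^m$ and 
$r''(m) = \max(r(m), r'(m))$. Apply Theorem 4.9 with $\mathcal{B}_0$ being the trivial partitioning consisting of one 
cell. This yields a factor $\mathcal{B}$ with rank at least $r''(|\mathcal{B}|)$, and a syntactic refinement $\mathcal{B}'$ of $\mathcal{B}$ with 
rank at least $r''(|\mathcal{B}'|)$. Let $s$ be a uniformly chosen random element from $\mathbb{F}_p^{|\mathcal{B}'|-|\mathcal{B}|}$. 

Observe that for every cell $c \in \mathbb{F}_p^{|\mathcal{B}|}$ of $\mathcal{B}$, at most a $0.1\,p^{-|\mathcal{B}|}$ fraction of the subcells 
$c' \in \{c\} \times \mathbb{F}_p^{|\mathcal{B}'|-|\mathcal{B}|}$ of $\mathcal{B}'$ have $\mathbb{E}_x[(f_3(x))^2 \mid c'] > \delta(|\mathcal{B}|)^2$. To show this, assume on the contrary that even 
for one cell $c \in \mathbb{F}_p^{|\mathcal{B}|}$ this event does not occur, and denote by $S$ the set of cells $c' \in \mathbb{F}_p^{|\mathcal{B}'|}$ of $\mathcal{B}'$ that 
lie in $c$ for which $\mathbb{E}_x[(f_3(x))^2 \mid c'] > \delta(|\mathcal{B}|)^2$. By our assumption $|S| > (0.1\,p^{-|\mathcal{B}|})\,p^{|\mathcal{B}'|-|\mathcal{B}|}$, and then 

$$\|f_3\|_2^2 = \mathbb{E}_{x \in \mathbb{F}_p^n}[(f_3(x))^2] \ge \delta(|\mathcal{B}|)^2 \Pr_{x \in \mathbb{F}_p^n}[\mathcal{B}(x) = c \wedge \mathcal{B}'(x) \in S] > 0.09\, \delta(|\mathcal{B}|)^2 / p^{2|\mathcal{B}|} > \Delta(|\mathcal{B}|)^2,$$

a contradiction to the guarantee of Theorem 4.9. 

Hence, for any fixed $c$, the probability that $s$ is such that $\mathbb{E}_x[(f_3(x))^2 \mid (c,s)] > \delta(|\mathcal{B}|)^2$ is at 
most $0.1\,p^{-|\mathcal{B}|}$. By the union bound, with probability at least 3/4, for every $c \in \mathbb{F}_p^{|\mathcal{B}|}$ the subcell 
$c' = (c,s)$ has the property that $\mathbb{E}_x[(f_3(x))^2 \mid c'] \le \delta(|\mathcal{B}|)^2$. 

Also, because $\mathcal{B}'$ $\zeta/4$-represents $\mathcal{B}$, the expected number of cells $c$ for which 
$|\mathbb{E}[f|c] - \mathbb{E}[f|(c,s)]| > \zeta$ is less than $\zeta/4 \cdot p^{|\mathcal{B}|}$. So, by the Markov inequality, with probability at least 3/4 

$$\Pr_{c \in \mathbb{F}_p^{|\mathcal{B}|}}\big[\,|\mathbb{E}[f|c] - \mathbb{E}[f|(c,s)]| > \zeta\,\big] < \zeta.$$

We conclude that an $s$ exists with both the desired properties. ■ 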
4.3 Extending to Multiple Functions 



The theorems so far referred to only a single function $f : \mathbb{F}_p^n \to \{0,1\}$. However, we actually 
require decomposition theorems which work for several functions $f^{(1)}, \ldots, f^{(R)} : \mathbb{F}_p^n \to \{0,1\}$ 
simultaneously with a single polynomial factor; alternatively, this could be thought of as 
decomposing a single "vector" function $f : \mathbb{F}_p^n \to \{0,1\}^R$. 

It is quite straightforward to adapt all the previous proofs to this framework. The main adap- 
tation to be done is the following version of the definition of a density index. 

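Definition 4.11 The density index of a factor $\mathcal{B}$ with respect to a vector function 
$f = (f^{(1)}, \ldots, f^{(R)}) : \mathbb{F}_p^n \to \{0,1\}^R$ is the sum of the squared $\ell_2$ norms of the conditional 
expectations of the $f^{(i)}$ functions, that is $\mathrm{indd}(\mathcal{B}) = \sum_{i=1}^{R} \mathbb{E}\big[(\mathbb{E}[f^{(i)} \mid \mathcal{B}])^2\big]$. 

Given a function $h : \mathbb{N} \to \mathbb{N}$ and a real parameter $\gamma$, a factor $\mathcal{B}$ is $(h,\gamma)$-robust (semantically) 
if there exists no $\mathcal{B}'$ which is a semantic refinement of $\mathcal{B}$ for which $|\mathcal{B}'| \le h(|\mathcal{B}|)$ and 
$\mathrm{indd}(\mathcal{B}') \ge \mathrm{indd}(\mathcal{B}) + \gamma$. 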
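
As a small illustration of Definition 4.11 (not part of the original argument), the following sketch computes the density index by brute force over a tiny domain. The names `density_index`, `coarse` and `fine` are hypothetical, the factor is modelled simply as a function from points to cell labels rather than as a polynomial factor, and values in $[R]$ are represented as 0, ..., R-1.

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

def density_index(points, factor, f, R):
    """Sum over i of E_x[(E[f^(i) | factor](x))^2], where f^(i) = indicator of f(x) == i."""
    cells = defaultdict(list)
    for x in points:
        cells[factor(x)].append(x)
    total = Fraction(0)
    for members in cells.values():
        weight = Fraction(len(members), len(points))   # relative size of the cell
        for i in range(R):
            mean = Fraction(sum(1 for x in members if f(x) == i), len(members))
            total += weight * mean ** 2                # contribution of f^(i) on this cell
    return total

p, R = 3, 3
points = list(product(range(p), repeat=2))            # the domain F_3^2
f = lambda x: (x[0] * x[1]) % p                       # toy function with values in {0, 1, 2}
coarse = lambda x: (x[0],)                            # partition by the first coordinate
fine = lambda x: (x[0], x[1])                         # a refinement of `coarse`

ind_coarse = density_index(points, coarse, f, R)
ind_fine = density_index(points, fine, f, R)
```

In this toy case the index does not decrease under refinement, in line with the role the index plays in the robustness definition; the index of a vector of indicator functions never exceeds 1, while for general vector functions the bound is R.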

From here we can follow nearly the exact same arguments. The main difference is that now all 
resulting bounds will depend on $R$, starting with the multiple functions analog of $T_{3.19}$, 
as the index is now bounded by $R$ rather than 1. Eventually we can reach the following version 
of the subcell selection theorem. 

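Theorem 4.12 (Subcell Selection - Multiple Functions) Suppose $\zeta > 0$ and $d \ge 1$ is an 
integer less than $p$. Let $\eta, \delta : \mathbb{N} \to \mathbb{R}^+$ be arbitrary non-increasing functions, and let $r : \mathbb{N} \to \mathbb{N}$ 
be an arbitrary non-decreasing function. Then, there exists $C = C_{4.12}(\delta,\eta,p,r,\zeta,R)$ such that 
the following holds. 

Given $f^{(1)}, \ldots, f^{(R)} : \mathbb{F}_p^n \to \{0,1\}$, there exist functions $f_1^{(i)}, f_2^{(i)}, f_3^{(i)} : \mathbb{F}_p^n \to \mathbb{R}$ for all $i \in [R]$, 
a polynomial factor $\mathcal{B}$ with cells denoted by elements of $\mathbb{F}_p^{|\mathcal{B}|}$, a syntactic refinement $\mathcal{B}'$ of $\mathcal{B}$ 
with complexity at most $C$ and cells denoted by elements of $\mathbb{F}_p^{|\mathcal{B}|} \times \mathbb{F}_p^{|\mathcal{B}'|-|\mathcal{B}|}$, and an element 
$s \in \mathbb{F}_p^{|\mathcal{B}'|-|\mathcal{B}|}$ such that the following is true: 

• $f^{(i)} = f_1^{(i)} + f_2^{(i)} + f_3^{(i)}$ for every $i \in [R]$. 

• $f_1^{(i)} = \mathbb{E}[f^{(i)} \mid \mathcal{B}']$ for every $i \in [R]$. 

• $\|f_2^{(i)}\|_{U^{d+1}} \le \eta(|\mathcal{B}'|)$ for every $i \in [R]$. 

• For every $i \in [R]$, $f_1^{(i)}$ and $f_1^{(i)} + f_3^{(i)}$ have range $[0,1]$, and $f_2^{(i)}$ and $f_3^{(i)}$ have range $[-1,1]$. 

• $\mathcal{B}$ is of rank at least $r(|\mathcal{B}|)$ 

• $\mathcal{B}'$ is of rank at least $r(|\mathcal{B}'|)$ 

• For every $c \in \mathbb{F}_p^{|\mathcal{B}|}$, the subcell $c' = (c,s) \in \mathbb{F}_p^{|\mathcal{B}'|}$ has the property that 
$\mathbb{E}_x[(f_3^{(i)}(x))^2 \mid \mathcal{B}'(x) = (c,s)] \le \delta(|\mathcal{B}|)^2$ for every $i \in [R]$. 

• $\Pr_{c \in \mathbb{F}_p^{|\mathcal{B}|}}\big[\exists\, i \in [R] : |\mathbb{E}[f^{(i)}|c] - \mathbb{E}[f^{(i)}|(c,s)]| > \zeta\big] < \zeta$, where we denote 
$\mathbb{E}[f|c] = \mathbb{E}[f(x) \mid \mathcal{B}(x) = c]$ and $\mathbb{E}[f|(c,s)] = \mathbb{E}[f(x) \mid \mathcal{B}'(x) = (c,s)]$. 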
5 Counting and Testability 



5.1 Counting Patterns inside Cells 



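Let $\mathcal{B}$ be a polynomial factor generated by the polynomials $P_1, \ldots, P_C : \mathbb{F}_p^n \to \mathbb{F}_p$, and let 
$b_1, \ldots, b_m \in \mathbb{F}_p^C$ denote the images of $m$ cells of $\mathcal{B}$. We will want to estimate probabilities of the 
following form: 

$$\Pr_{x_1,\ldots,x_\ell}\big[\mathcal{B}(a_1(x_1,\ldots,x_\ell)) = b_1 \wedge \mathcal{B}(a_2(x_1,\ldots,x_\ell)) = b_2 \wedge \cdots \wedge \mathcal{B}(a_m(x_1,\ldots,x_\ell)) = b_m\big] \qquad (2)$$

where $(a_1, \ldots, a_m)$ is an affine constraint of size $m$ on $\ell$ variables. In Lemma 3.8, we analyzed 
the expectation when $\ell = m = 1$ and $a_1(x_1) = x_1$. In order to deal with the more general form, 
let us re-express (2) in the following way, writing $b_j = (b_{1,j}, \ldots, b_{C,j})$: 

$$\Pr_{x_1,\ldots,x_\ell}\big[\mathcal{B}(a_1(x_1,\ldots,x_\ell)) = b_1 \wedge \cdots \wedge \mathcal{B}(a_m(x_1,\ldots,x_\ell)) = b_m\big]$$
$$= \mathbb{E}_{x_1,\ldots,x_\ell \in \mathbb{F}_p^n}\Big[\prod_{i \in [C],\, j \in [m]} \frac{1}{p} \sum_{\lambda_{i,j} \in \mathbb{F}_p} e\big(\lambda_{i,j} \cdot (P_i(a_j(x_1,\ldots,x_\ell)) - b_{i,j})\big)\Big]$$
$$= p^{-mC} \sum_{(\lambda_{i,j}) \in \mathbb{F}_p^{C \times m}} e\Big(-\sum_{i \in [C],\, j \in [m]} \lambda_{i,j} b_{i,j}\Big)\; \mathbb{E}_{x_1,\ldots,x_\ell}\Big[e\Big(\sum_{i \in [C],\, j \in [m]} \lambda_{i,j} P_i(a_j(x_1,\ldots,x_\ell))\Big)\Big] \qquad (3)$$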
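
The passage from (2) to (3) rests only on the identity $\mathbb{1}[y = b] = \frac{1}{p}\sum_{\lambda \in \mathbb{F}_p} e(\lambda (y - b))$. The following sketch, offered as a sanity check rather than as part of the proof, compares both sides by brute force on a toy instance. It assumes $e(t) = \exp(2\pi i t / p)$, and the specific choices ($p = 3$, $n = 2$, one polynomial $P_1(x) = x_1 x_2$, forms $X_1$ and $X_1 + X_2$) as well as all function names are hypothetical.

```python
import cmath
from itertools import product

p, n = 3, 2
forms = [(1, 0), (1, 1)]                  # a_1 = X_1, a_2 = X_1 + X_2  (ell = 2, m = 2)
polys = [lambda x: (x[0] * x[1]) % p]     # P_1(x) = x_1 * x_2, so C = 1
m, C, ell = len(forms), len(polys), len(forms[0])

def e(t):
    """Additive character of F_p (assumed normalisation)."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)

def apply_form(a, xs):
    """Evaluate the linear form a on vectors xs = (x_1, ..., x_ell) in F_p^n."""
    return tuple(sum(coef * x[k] for coef, x in zip(a, xs)) % p for k in range(n))

points = list(product(range(p), repeat=n))
inputs = list(product(points, repeat=ell))

def values(xs):
    """Matrix v[i][j] = P_i(a_j(x_1, ..., x_ell))."""
    return [[P(apply_form(a, xs)) for a in forms] for P in polys]

def direct_probability(b):
    """Left-hand side: fraction of inputs landing in the prescribed cells."""
    return sum(1 for xs in inputs if values(xs) == b) / len(inputs)

def character_sum(b):
    """Right-hand side of (3): p^{-mC} * sum over lambda of phase * expectation."""
    total = 0
    for lam in product(range(p), repeat=C * m):
        L = [lam[i * m:(i + 1) * m] for i in range(C)]
        phase = e(-sum(L[i][j] * b[i][j] for i in range(C) for j in range(m)))
        inner = sum(e(sum(L[i][j] * v[i][j] for i in range(C) for j in range(m)))
                    for v in map(values, inputs)) / len(inputs)
        total += phase * inner
    return total / p ** (m * C)
```

Here `b` is indexed as `b[i][j]`, matching $b_{i,j}$; for instance `direct_probability([[0, 1]])` and `character_sum([[0, 1]])` should agree up to floating-point error.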



Hatami and Lovett in [HL11a, HL11b] studied expectations such as those in (3) and proved the 
following dichotomy. 

Lemma 5.1 (Lemma 5.1 in [HL11b]) Suppose we are given $\varepsilon \in (0,1)$, a positive integer $d < p$ 
and an affine constraint $(A, \sigma)$ where $A = (a_1, \ldots, a_m)$ is of size $m$ and over $\ell$ variables. Let 
$P_1, \ldots, P_C : \mathbb{F}_p^n \to \mathbb{F}_p$ be a collection of polynomials of degree at most $d$ such that the rank of the 
polynomial factor generated by $P_1, \ldots, P_C$ is at least $r_{3.7}(d, \varepsilon)$. Then, for every set of coefficients 
$\Lambda = \{\lambda_{i,j} \in \mathbb{F}_p : i \in [C], j \in [m]\}$, if $P_\Lambda : (\mathbb{F}_p^n)^\ell \to \mathbb{F}_p$ is the polynomial defined by: 

$$P_\Lambda(x_1, \ldots, x_\ell) = \sum_{i=1}^{C} \sum_{j=1}^{m} \lambda_{i,j} P_i(a_j(x_1, \ldots, x_\ell))$$

then either $P_\Lambda$ is the zero polynomial, or else 

$$\Big|\mathbb{E}_{x_1,\ldots,x_\ell \in \mathbb{F}_p^n}\big[e(P_\Lambda(x_1, \ldots, x_\ell))\big]\Big| < \varepsilon.$$

Thus, to bound (3), we need to count the number of sets $\Lambda$ such that $P_\Lambda \equiv 0$, in the language 
of Lemma 5.1. To this end, let us make the following definition, following the works of Gowers 
and Wolf [GW10b, GW10a]. 



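Definition 5.2 (Dimension of linear forms) For a positive integer $d$ and linear form 
$L(X_1, \ldots, X_\ell) = a_1 X_1 + a_2 X_2 + \cdots + a_\ell X_\ell$ where $a_1, \ldots, a_\ell \in \mathbb{F}_p$, let the $d$th tensor power of 
$L$ denote: 

$$L^{\otimes d} = \Big(\prod_{j=1}^{d} a_{i_j} \;:\; i_1, \ldots, i_d \in [\ell]\Big) \in \mathbb{F}_p^{\ell^d}.$$

The $d$-dimension of a sequence of linear forms is the dimension of the linear span of their $d$th tensor 
powers. Given positive integers $d_1, \ldots, d_C$ and an affine constraint $A = (a_1, \ldots, a_m)$ of size $m$ on $\ell$ 
variables, define the $(d_1, \ldots, d_C)$-dimension of $A$ to be: 

$$\sum_{i=1}^{C} \dim\big(\{a_1^{\otimes d_i}, \ldots, a_m^{\otimes d_i}\}\big)$$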

To show the relevance of the above definition, we first need an algebraic "all or nothing" lemma 
from [HL11b] that concerns linear forms and polynomials without explicitly referring to the dimension 
of the forms. 

Lemma 5.3 (Lemma 5.2 in [HL11b]) Suppose $\lambda_{i,j} \in \mathbb{F}_p$ for $i \in [C], j \in [m]$, and 
$d_1, \ldots, d_C \in [d]$, where $d < p$. Also, let $(A, \sigma)$ where $A = (a_1, \ldots, a_m)$ be an affine constraint, 
where every linear form $a_j$ is over variables $X_1, \ldots, X_\ell$. Then, one of the following holds: 

• For every collection of linearly independent polynomials $P_1, \ldots, P_C$ of degree $d_1, \ldots, d_C$ 
respectively: 

$$\sum_{i=1}^{C} \sum_{j=1}^{m} \lambda_{i,j} P_i(a_j(X_1, \ldots, X_\ell)) \equiv 0$$

• For every collection of linearly independent polynomials $P_1, \ldots, P_C$ of degree $d_1, \ldots, d_C$ 
respectively: 

$$\sum_{i=1}^{C} \sum_{j=1}^{m} \lambda_{i,j} P_i(a_j(X_1, \ldots, X_\ell)) \not\equiv 0$$

Now we can make the connection between the definition of the dimension of the linear forms, 
and their effect on a sequence of polynomials with given degrees. 

Lemma 5.4 Let the notation here be the same as in Lemma 5.1. If $d_1, \ldots, d_C$ are the respective 
degrees of the polynomials $P_1, \ldots, P_C$ and if $s$ is the $(d_1, \ldots, d_C)$-dimension of $(a_1, \ldots, a_m)$, then 
the number of sets $\Lambda$ for which $P_\Lambda \equiv 0$ equals $p^{mC-s}$. 

Proof: Notice that we want to show that the number of sets $\Lambda$ for which $P_\Lambda \equiv 0$ is dependent 
just on the degrees of the polynomials $P_1, \ldots, P_C$ and not on any other specifics. For this we use 
Lemma 5.3, so that instead of having the polynomials $P_1, \ldots, P_C$, we can analyze a collection 
of much simpler linearly independent polynomials of respective degrees $d_1, \ldots, d_C$. 

In particular, let us define $P_i'(x) = x_i^{d_i}$ for every $i \in [C]$ (we assume that $n \ge C$). Then, 
the polynomial $P_\Lambda'(X_1, \ldots, X_\ell) = \sum_{i=1}^{C} \sum_{j=1}^{m} \lambda_{i,j} P_i'(a_j(X_1, \ldots, X_\ell))$ is identically zero exactly 
when $\sum_{j=1}^{m} \lambda_{i,j}\, a_j^{\otimes d_i} = 0$ for every $i \in [C]$. 

Standard linear algebra and the definition of $(d_1, \ldots, d_C)$-dimension then show that the set of 
$\Lambda$'s for which $P_\Lambda' \equiv 0$ forms a linear subspace of codimension $s$, and hence has size $p^{mC-s}$. ■ 
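
To make the linear-algebra step concrete, the following sketch (an illustration under stated assumptions, not the paper's construction) computes tensor powers and the $d$-dimension over $\mathbb{F}_p$ by Gaussian elimination, and counts by brute force the coefficient sets with $\sum_j \lambda_{i,j} a_j^{\otimes d_i} = 0$ for every $i$. All function names are hypothetical; the example uses the three forms $X_1$, $X_1 + X_2$, $X_1 + 2X_2$ over $\mathbb{F}_5$ with degrees $(1, 2)$.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank over F_p of a matrix given as a list of rows (Gaussian elimination)."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)              # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(x - factor * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def tensor_power(coeffs, d, p):
    """d-th tensor power: one coordinate per index sequence (i_1, ..., i_d)."""
    vec = []
    for idx in product(range(len(coeffs)), repeat=d):
        value = 1
        for i in idx:
            value = (value * coeffs[i]) % p
        vec.append(value)
    return vec

def d_dimension(forms, d, p):
    return rank_mod_p([tensor_power(a, d, p) for a in forms], p)

def count_vanishing(forms, degrees, p):
    """Number of (lambda_{i,j}) with sum_j lambda_{i,j} a_j^{(tensor d_i)} = 0 for all i."""
    count = 1
    for d in degrees:                                   # the conditions for distinct i are independent
        tensors = [tensor_power(a, d, p) for a in forms]
        count *= sum(
            1 for lam in product(range(p), repeat=len(forms))
            if all(sum(l * t[k] for l, t in zip(lam, tensors)) % p == 0
                   for k in range(len(tensors[0])))
        )
    return count

p = 5
forms = [(1, 0), (1, 1), (1, 2)]                       # X_1, X_1 + X_2, X_1 + 2 X_2
degrees = (1, 2)
s = sum(d_dimension(forms, d, p) for d in degrees)     # the (1, 2)-dimension
```

With $m = 3$ and $C = 2$ this gives $s = 2 + 3 = 5$, and the brute-force count agrees with $p^{mC - s} = 5$.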

At this point, we can move to the main theorem of this section. Let us first make the following 
definition, that in some ways captures the essence of "polynomial feasibility" for a sequence of 
values. 
Definition 5.5 Given an affine constraint $A = (a_1, \ldots, a_m)$ and positive integers $d_1, \ldots, d_C$, 
we say that elements $b_1, \ldots, b_m$, where $b_j = (b_{1,j}, \ldots, b_{C,j}) \in \mathbb{F}_p^C$ for every $j \in [m]$, are consistent 
with respect to $A$ and $d_1, \ldots, d_C$ if the following is true: 

• For every set $\Lambda = \{\lambda_{i,j} \in \mathbb{F}_p : i \in [C], j \in [m]\}$ for which $\sum_{j \in [m]} \lambda_{i,j} (a_j(X_1, \ldots, X_\ell))^{\otimes d_i}$ 
equals 0 for all $i \in [C]$, it is the case that $\sum_{j \in [m]} \lambda_{i,j} b_{i,j} = 0$ as well for all $i \in [C]$. 

The following is easy to observe using basic linear algebra: 



Observation 5.6 Being consistent is equivalent to satisfying the following condition: For every 
set $\Lambda = \{\lambda_{i,j} \in \mathbb{F}_p : i \in [C], j \in [m]\}$ for which $\sum_{j \in [m]} \lambda_{i,j} (a_j(X_1, \ldots, X_\ell))^{\otimes d_i}$ equals 0 for all 
$i \in [C]$, we have $\sum_{i \in [C]} \sum_{j \in [m]} \lambda_{i,j} b_{i,j} = 0$. 
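
Consistency can be checked mechanically. For each $i$, the condition of Definition 5.5 says that the row $(b_{i,1}, \ldots, b_{i,m})$ is orthogonal to every $\lambda$ in the kernel of the matrix whose columns are the $a_j^{\otimes d_i}$, which holds exactly when appending that row does not increase the rank. The sketch below uses this rank test; the function names are hypothetical and the example is a toy one over $\mathbb{F}_5$.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank over F_p of a matrix given as a list of rows (Gaussian elimination)."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)              # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(x - factor * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def tensor_power(coeffs, d, p):
    """d-th tensor power of a linear form given by its coefficient tuple."""
    vec = []
    for idx in product(range(len(coeffs)), repeat=d):
        value = 1
        for i in idx:
            value = (value * coeffs[i]) % p
        vec.append(value)
    return vec

def is_consistent(forms, degrees, cells, p):
    """cells[j] = (b_{1,j}, ..., b_{C,j}); checks Definition 5.5 coordinate by coordinate."""
    for i, d in enumerate(degrees):
        tensors = [tensor_power(a, d, p) for a in forms]
        # Rows indexed by tensor coordinates, columns by j.
        matrix = [[t[k] for t in tensors] for k in range(len(tensors[0]))]
        b_row = [cells[j][i] % p for j in range(len(forms))]
        if rank_mod_p(matrix + [b_row], p) != rank_mod_p(matrix, p):
            return False
    return True

p = 5
forms = [(1, 0), (1, 1), (1, 2)]       # X_1, X_1 + X_2, X_1 + 2 X_2
progression = [(0,), (1,), (2,)]       # b_1 - 2 b_2 + b_3 = 0
broken = [(0,), (1,), (3,)]            # violates that relation
```

For a single degree-1 polynomial the only constraint is $b_1 - 2b_2 + b_3 = 0$, so `progression` is consistent and `broken` is not; for a single degree-2 polynomial the three tensor squares are linearly independent, so every choice of cells is consistent.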

The following theorem shows that the expectation in (2) is nonzero, and is in fact close to a 
calculated number, if and only if $b_1, \ldots, b_m$ are consistent. 

Theorem 5.7 Let $\varepsilon \in (0,1)$, let $(A, \sigma)$ where $A = (a_1, \ldots, a_m)$ be an affine constraint over $\ell$ 
variables, and let $\mathcal{B}$ be a polynomial factor of degree $d$, complexity $C$ and rank at least $r_{3.7}(d, \varepsilon)$ 
generated by the polynomials $P_1, \ldots, P_C : \mathbb{F}_p^n \to \mathbb{F}_p$. For every $i \in [C]$, let $d_i$ be the degree of $P_i$. 
Let $s$ denote the $(d_1, \ldots, d_C)$-dimension of $A$ over $\mathbb{F}_p$. Finally, for every $j \in [m]$, fix the image 
of a cell in $\mathcal{B}$, indexed by $b_j = (b_{1,j}, \ldots, b_{C,j}) \in \mathbb{F}_p^C$. 
If $b_1, \ldots, b_m$ are consistent with respect to $A$ and $d_1, \ldots, d_C$, then: 

$$\Pr_{x_1,\ldots,x_\ell \in \mathbb{F}_p^n}\big[\mathcal{B}(a_1(x_1,\ldots,x_\ell)) = b_1 \wedge \cdots \wedge \mathcal{B}(a_m(x_1,\ldots,x_\ell)) = b_m\big] = p^{-s} \pm \varepsilon$$

If $b_1, \ldots, b_m$ are not consistent with respect to $A$ and $d_1, \ldots, d_C$, then the above probability is 0. 

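Proof: Assume first that the supposition is true. Let us rewrite the probability in question 
as in (3): 

$$p^{-mC} \sum_{(\lambda_{i,j}) \in \mathbb{F}_p^{C \times m}} e\Big(-\sum_{i \in [C],\, j \in [m]} \lambda_{i,j} b_{i,j}\Big)\; \mathbb{E}_{x_1,\ldots,x_\ell}\Big[e\Big(\sum_{i \in [C],\, j \in [m]} \lambda_{i,j} P_i(a_j(x_1,\ldots,x_\ell))\Big)\Big]$$

According to Lemma 5.1, the expectation in the above expression is at most $\varepsilon$ in absolute value 
if $\sum_{i \in [C],\, j \in [m]} \lambda_{i,j} P_i(a_j(x_1,\ldots,x_\ell))$ is not the zero polynomial. On the other hand, by the 
argument of Lemma 5.4, if $\sum_{i \in [C],\, j \in [m]} \lambda_{i,j} P_i(a_j(x_1,\ldots,x_\ell)) \equiv 0$, then $\sum_{j \in [m]} \lambda_{i,j}\, a_j^{\otimes d_i}$ 
equals 0 for every $i \in [C]$. Hence, in this case, by consistency, $\sum_{j \in [m]} \lambda_{i,j} b_{i,j} = 0$ for every $i \in [C]$, and so such a choice 
of $\{\lambda_{i,j}\}$ contributes 1 to the outermost summation. The number of such choices of $\{\lambda_{i,j}\}$ is 
$p^{mC-s}$ by Lemma 5.4. Thus: 

$$\Pr_{x_1,\ldots,x_\ell \in \mathbb{F}_p^n}\big[\forall i \in [C], j \in [m] :\ P_i(a_j(x_1,\ldots,x_\ell)) = b_{i,j}\big] = p^{-mC}\big(p^{mC-s} \pm p^{mC}\varepsilon\big) = p^{-s} \pm \varepsilon$$

The last part of the Theorem follows easily. Suppose the probability in question is nonzero, and 
so there exist $x_1, \ldots, x_\ell$ so that $P_i(a_j(x_1,\ldots,x_\ell)) = b_{i,j}$ for all $i \in [C]$ and $j \in [m]$. Then, for 
all possible values of $\lambda_{i,j}$ we have $\sum_{i \in [C],\, j \in [m]} \lambda_{i,j} b_{i,j} = \sum_{i \in [C],\, j \in [m]} \lambda_{i,j} P_i(a_j(x_1,\ldots,x_\ell))$. But, 
by the argument of Lemma 5.4, $\sum_{j \in [m]} \lambda_{i,j} P_i(a_j(x_1,\ldots,x_\ell)) \equiv 0$ if $\sum_{j \in [m]} \lambda_{i,j}\, a_j^{\otimes d_i} = 0$ for 
any $i \in [C]$, and so the supposition is true. ■ 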
By now we know the importance of the definition of consistency. One more building block that 
we need shows why, when selecting cells from a refining partition as in Theorem 4.9, consistency 
will pass over from B to B'. 

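Lemma 5.8 Suppose that $\mathcal{B}'$ is a syntactic refinement of $\mathcal{B}$, that $A = (a_1, \ldots, a_m)$ is a sequence 
of linear forms where $(A, \sigma)$ is some affine constraint, and that $c_1, \ldots, c_m \in \mathbb{F}_p^{|\mathcal{B}|}$ are consistent 
with $A$ and the degrees $d_1, \ldots, d_{|\mathcal{B}|}$ of the polynomials defining $\mathcal{B}$. Given any fixed $s \in \mathbb{F}_p^{|\mathcal{B}'|-|\mathcal{B}|}$, 
the cells $c'_1, \ldots, c'_m \in \mathbb{F}_p^{|\mathcal{B}'|}$ defined by the concatenations $c'_j = (c_j, s)$ for all $j \in [m]$ are consistent 
with respect to $A$ and the degrees $d_1, \ldots, d_{|\mathcal{B}'|}$ of the polynomials defining $\mathcal{B}'$. 

Proof: Coordinate-wise, for $j \in [m]$, $c'_j = (c'_{1,j}, \ldots, c'_{|\mathcal{B}'|,j}) \in \mathbb{F}_p^{|\mathcal{B}'|}$ satisfies $c'_{i,j} = c_{i,j}$ for 
all $1 \le i \le |\mathcal{B}|$ and $c'_{i,j} = s_{i-|\mathcal{B}|+1}$ for all $|\mathcal{B}| < i \le |\mathcal{B}'|$. 

To show the consistency condition for $i > |\mathcal{B}|$, recall that each $a_j$ is of the form $X_1 + \sum_{r=2}^{\ell} c_r X_r$ 
for $c_r \in \mathbb{F}_p$ (as it came from an affine constraint). So, whenever $\sum_{j \in [m]} \lambda_{i,j} (a_j)^{\otimes d_i} = 0$ and 
$d_i > 0$, we have that $\sum_{j \in [m]} \lambda_{i,j} = 0$, simply by looking at the sum along the coordinate of $a_j^{\otimes d_i}$ 
corresponding to $X_1^{d_i}$ (i.e. the one labeled by the sequence $(1, \ldots, 1)$; the vector $a_j^{\otimes d_i}$ will always 
be 1 at that coordinate). Since for any $i > |\mathcal{B}|$, $c'_{i,j} = s_{i-|\mathcal{B}|+1}$ is independent of $j$, it follows that 
for any $i > |\mathcal{B}|$, if $\sum_{j \in [m]} \lambda_{i,j} (a_j)^{\otimes d_i} = 0$, then $\sum_{j \in [m]} \lambda_{i,j} c'_{i,j} = s_{i-|\mathcal{B}|+1} \sum_{j \in [m]} \lambda_{i,j} = 0$ for all 
$i > |\mathcal{B}|$. Since we already know (by the consistency of the $c_j$ relative to $\mathcal{B}$) that $\sum_{j \in [m]} \lambda_{i,j} c'_{i,j} = 0$ 
for all $i \le |\mathcal{B}|$, we can conclude the proof. ■ 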



5.2 Big Picture Arguments 

We will prove the existence of many copies of a given linear constraint by analyzing the existence 
of a particular configuration of cells of a factor B, where in every cell we look at the entire set 
of values that $f$ can take at once. The following is a formal definition of the function giving the 
"big picture". 

Definition 5.9 Given a function $f : \mathbb{F}_p^n \to [R]$ and a polynomial factor $\mathcal{B}$, the big picture 
function of $f$ is the function $f_{\mathcal{B}} : \mathbb{F}_p^{|\mathcal{B}|} \to 2^{[R]}$, where $2^{[R]}$ denotes the power set of $[R]$, defined by 
$f_{\mathcal{B}}(y) = \{f(x) : \mathcal{B}(x) = y\}$. In other words, $f_{\mathcal{B}}(y)$ is the set of all values that $f$ takes within the 
corresponding cell of $\mathcal{B}$. 
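
For concreteness, the big picture function can be tabulated by a single pass over the domain. The sketch below is only an illustration on a toy domain: the name `big_picture` is hypothetical, the factor is modelled as an arbitrary function from points to cell labels, and values in $[R]$ are represented as 0, ..., R-1. Cells that receive no point are simply absent from the table (they would map to the empty set).

```python
from itertools import product
from collections import defaultdict

def big_picture(points, factor, f):
    """Map each cell label y to the set {f(x) : factor(x) == y}."""
    table = defaultdict(set)
    for x in points:
        table[factor(x)].add(f(x))
    return dict(table)

p = 3
points = list(product(range(p), repeat=2))    # the domain F_3^2
f = lambda x: (x[0] * x[1]) % p               # toy function with values in {0, 1, 2}
factor = lambda x: (x[0],)                    # one "polynomial": the first coordinate

f_B = big_picture(points, factor, f)
```

In this example the cell with first coordinate 0 sees only the value 0, while the other two cells see all three values.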

On the other hand, given any function $g : \mathbb{F}_p^C \to 2^{[R]}$, and a set of degrees $d_1, \ldots, d_C$ (of which 
we think as corresponding to the degrees of some future polynomial factor of size $C$), we will 
define what it means for such a function to "induce" a copy of a given constraint. 

Definition 5.10 (Partially induce) Suppose we are given positive integers $d_1, \ldots, d_C$, a 
function $g : \mathbb{F}_p^C \to 2^{[R]}$, and an induced affine constraint $(A, \sigma)$ of size $m$ over $\ell$ variables. We 
say that $g$ partially $(d_1, \ldots, d_C)$-induces $(A, \sigma)$ if there exist $\{b_j = (b_{1,j}, \ldots, b_{C,j}) \in \mathbb{F}_p^C : j \in [m]\}$ 
making the following true. 

• $b_1, \ldots, b_m$ are consistent with respect to $A$ and $d_1, \ldots, d_C$. 

• $\sigma_j \in g(b_j)$ for every $j \in [m]$. 
The big picture function defined above extracts a finitary description of a function $f : \mathbb{F}_p^n \to [R]$ 
in relation to some $\mathcal{B}$, which we will later obtain through a decomposition theorem. Regardless 
of how we obtained $\mathcal{B}$, moving from an induced constraint of $f$ to a partially induced constraint 
of the big picture function is always guaranteed. 

Observation 5.11 If $f : \mathbb{F}_p^n \to [R]$ induces a constraint $(A, \sigma)$, then for a factor $\mathcal{B}$ with degree 
sequence $(d_1, \ldots, d_{|\mathcal{B}|})$ (where all degrees are smaller than $p$), the function $f_{\mathcal{B}} : \mathbb{F}_p^{|\mathcal{B}|} \to 2^{[R]}$ 
partially $(d_1, \ldots, d_{|\mathcal{B}|})$-induces $(A, \sigma)$. 

Proof: Let $m$ be the size of $A$ and $\ell$ be its number of variables. Suppose that $f$ induces 
$(A, \sigma)$ at $x_1, \ldots, x_\ell$, and let $c_1, \ldots, c_m \in \mathbb{F}_p^{|\mathcal{B}|}$ be the images of the $m$ cells in $\mathcal{B}$ defined by 
$c_1 = \mathcal{B}(a_1(x_1, \ldots, x_\ell)), c_2 = \mathcal{B}(a_2(x_1, \ldots, x_\ell)), \ldots, c_m = \mathcal{B}(a_m(x_1, \ldots, x_\ell))$ where $A = (a_1, \ldots, a_m)$. 
Then, because of the last condition in Theorem 5.7, it must be the case that $c_1, \ldots, c_m$ are 
consistent with respect to $A$ and $d_1, \ldots, d_{|\mathcal{B}|}$. This fulfills the first condition of Definition 5.10, 
and the second condition is true by the definition of every $f_{\mathcal{B}}(c_j)$ including all values that $f$ 
takes in that cell. ■ 

To handle a possibly infinite collection A of affine constraints, we will employ a compactness 
argument, analogous to one used in [AS08b] to bound the size of the constraint partially induced 
by the big picture function. Let us make the following definition: 

Definition 5.12 (The compactness function $\Psi_{\mathcal{A}}$) Suppose we are given a positive integer 
$C$ and a possibly infinite collection of induced affine constraints $\mathcal{A} = \{(A^1,\sigma^1),(A^2,\sigma^2), \ldots\}$, 
where each affine constraint $(A^i,\sigma^i)$ is of size $m_i$ and of complexity at most $d < p$. For fixed 
$d_1, \ldots, d_C < p$, denote by $\mathcal{G}(d_1, \ldots, d_C)$ the set of functions $g : \mathbb{F}_p^C \to 2^{[R]}$ that partially 
$(d_1, \ldots, d_C)$-induce some $(A^i, \sigma^i) \in \mathcal{A}$. Now, we define the following function: 

$$\Psi_{\mathcal{A}}(C) = \max_{d_1,\ldots,d_C < p}\ \max_{g \in \mathcal{G}(d_1,\ldots,d_C)}\ \min_{(A^i,\sigma^i)\ \text{partially induced by}\ g} m_i$$

Whenever $\mathcal{G}(d_1, \ldots, d_C)$ is empty we set the corresponding maximum to 0. 

Note that the above is indeed finite, as both the number of possible degree sequences (bounded 
by $p^C$) and the size of $\mathcal{G}(d_1, \ldots, d_C)$ (bounded by $2^{R p^C}$) are finite. The compactness function 
allows us to bound an induced constraint in advance, at least (for now) in the realm of big picture 
functions: 

Observation 5.13 Let $d_1, \ldots, d_C < p$ be a degree sequence, for which a function $g : \mathbb{F}_p^C \to 2^{[R]}$ 
partially induces some constraint from $\mathcal{A}$. Then $g$ will necessarily partially induce some 
$(A^i,\sigma^i) \in \mathcal{A}$ whose size is at most $\Psi_{\mathcal{A}}(|\mathcal{B}|)$. 

Proof: This is immediate, as a $g$ satisfying the above in particular belongs to $\mathcal{G}(d_1, \ldots, d_C)$. ■ 


For our proofs, we will refer first not to $f$ itself, but to some small modification of $f$ that will 
make it a "perfect" representation of some cells from $f$ according to some factor, which will be 
selected as per Corollary 4.10. 






Definition 5.14 (Function cleanup) Suppose we have a factor $\mathcal{B}'$ that is a syntactic 
refinement of $\mathcal{B}$, and some $s \in \mathbb{F}_p^{|\mathcal{B}'|-|\mathcal{B}|}$. The $\zeta$-cleanup $F$ of $f : \mathbb{F}_p^n \to [R]$ according to $\mathcal{B}$, $\mathcal{B}'$ 
and $s$ is constructed by executing the following steps in order (where as usual $(c,s)$ denotes the 
concatenation of $c$ and $s$): 

1. For every $z \in \mathbb{F}_p^n$ that is not covered by the cases below, let $F(z) = f(z)$. 

2. For every cell $c$ of $\mathcal{B}$ for which $|\Pr[f(x) = i \mid c] - \Pr[f(x) = i \mid (c,s)]| > \zeta$ for any $i \in [R]$, 
do the following. For every $z \in \mathcal{B}^{-1}(c)$, set $F(z) = \arg\max_{j \in [R]} \Pr[f(x) = j \mid (c,s)]$, 
the most popular value inside the subcell $(c,s)$ (breaking ties arbitrarily, but consistently 
within each cell $c$). 

3. For every cell $c$ of $\mathcal{B}$, for every $i \in [R]$ such that $\Pr[f(x) = i \mid (c,s)] < \zeta$, set $F(z) = 
\arg\max_{j \in [R]} \Pr[f(x) = j \mid (c,s)]$ for every $z \in f^{-1}(i) \cap \mathcal{B}^{-1}(c)$ (breaking ties arbitrarily, 
but consistently within each cell $c$). 
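
The three steps above can be sketched on a finite toy domain as follows. This is a hedged illustration rather than the paper's procedure verbatim: the name `cleanup` and its arguments are hypothetical, the factor $\mathcal{B}$ and the extra coordinates of $\mathcal{B}'$ are modelled as plain functions `cell_of` and `subid_of`, values in $[R]$ are 0, ..., R-1, ties are broken towards the smallest value, and empty subcells are skipped (an assumption of the sketch; for factors of high rank every subcell is non-empty).

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

def cleanup(points, f, cell_of, subid_of, s, zeta, R):
    F = {z: f[z] for z in points}                        # step 1: default F = f
    cells = defaultdict(list)
    for z in points:
        cells[cell_of(z)].append(z)
    for c, members in cells.items():
        sub = [z for z in members if subid_of(z) == s]   # the subcell (c, s)
        if not sub:
            continue                                     # sketch-only assumption
        pr_c = [Fraction(sum(f[z] == i for z in members), len(members)) for i in range(R)]
        pr_s = [Fraction(sum(f[z] == i for z in sub), len(sub)) for i in range(R)]
        popular = max(range(R), key=lambda j: pr_s[j])   # most popular value in (c, s)
        if any(abs(pr_c[i] - pr_s[i]) > zeta for i in range(R)):
            for z in members:                            # step 2: overwrite the whole cell
                F[z] = popular
        for i in range(R):                               # step 3: remove rare values
            if pr_s[i] < zeta:
                for z in members:
                    if f[z] == i:
                        F[z] = popular
    return F

p, R = 3, 2
points = list(product(range(p), repeat=2))
f = {z: 1 if z in [(0, 1), (1, 0)] else 0 for z in points}
F = cleanup(points, f,
            cell_of=lambda z: (z[0],), subid_of=lambda z: (z[1],),
            s=(0,), zeta=Fraction(2, 5), R=R)
```

In this example the cell with first coordinate 1 is poorly represented by its subcell and is overwritten entirely with the value 1 (step 2), while in the cell with first coordinate 0 the single point carrying the value 1, which is absent from the subcell, is reset to 0 (step 3).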

Lemma 5.15 If $f$, $\mathcal{B}$, $\mathcal{B}'$ and $s$ are such that $\mathcal{B}$ is of rank at least $r_{3.7}(p, \beta/p^{|\mathcal{B}|})$, and 
$\Pr_{c \in \mathbb{F}_p^{|\mathcal{B}|}}\big[\,|\mathbb{E}[f|c] - \mathbb{E}[f|(c,s)]| > \zeta\,\big] < \zeta$, then the corresponding $\zeta$-cleanup $F$ is $(2R + 1 + \beta)\zeta$-close 
to $f$. 

Proof: Observe that the second step changes the value of $F$ on at most a $\zeta$ fraction of the 
cells, by the condition involving $s$ in the statement of the lemma. By Lemma 3.8, each cell 
occupies at most a $(1 + \beta)p^{-|\mathcal{B}|}$ fraction of the entire domain. So, the fraction of points whose 
values changed in the second step is at most $\zeta p^{|\mathcal{B}|} \cdot (1 + \beta)p^{-|\mathcal{B}|} = (1 + \beta)\zeta$. 

The third step does not apply to any cell of $\mathcal{B}$ affected by the second step. Therefore, in the third 
case, for every $i \in [R]$, if $\Pr[f(x) = i \mid \mathcal{B}'(x) = (c,s)] < \zeta$ then $\Pr[f(x) = i \mid \mathcal{B}(x) = c] < 2\zeta$. 
Hence, the total fraction of the domain modified in the third case is at most $2R\zeta$. The total 
distance of $F$ from $f$ is therefore bounded by $(2R + 1 + \beta)\zeta$. ■ 



5.3 More about Algebra of Linear Forms 

A linear form $a(X_1, \ldots, X_\ell) = \sum_{i=1}^{\ell} \alpha_i X_i$ can be identified with a linear function over $\mathbb{F}_p^\ell$, and 
thus a transformation in the spirit of "a change of basis" can be formulated. 

Definition 5.16 (Change of view) We identify the form $a(X_1, \ldots, X_\ell) = \sum_{i=1}^{\ell} \alpha_i X_i$ with 
the linear function $a : \mathbb{F}_p^\ell \to \mathbb{F}_p$ given by $a(v) = \sum_{i=1}^{\ell} \alpha_i v_i$, where $v = (v_1, \ldots, v_\ell) \in \mathbb{F}_p^\ell$ (in 
essence this is obtained by letting the $X_i$ range over scalars from $\mathbb{F}_p$ rather than vectors from 
some space $\mathbb{F}_p^n$). 

Given an invertible $\ell \times \ell$ matrix $M$ over $\mathbb{F}_p$, the corresponding change of view of $a$ is the linear 
form $a'(X_1, \ldots, X_\ell) = \sum_{i=1}^{\ell} \alpha'_i X_i$ obtained by the following process: Consider the linear function 
corresponding to $a$, perform on its domain $\mathbb{F}_p^\ell$ the change of variables corresponding to $M$, and 
then take the linear form corresponding to its representation $a'$ in the new basis. 

The reason that we use the term "change of view" is to not confuse it with a change of basis of 
$\mathbb{F}_p^n$. The following observation is easy: 
Observation 5.17 If $(A, \sigma)$ is an affine constraint, and $A'$ is obtained by performing the same 
change of view over all linear forms of $A$, then a function $f : \mathbb{F}_p^n \to [R]$ satisfies $(A, \sigma)$ if and 
only if it satisfies $(A', \sigma)$. 

Additionally, a change of view does not affect the complexity of the affine constraint. 
This yields the following lemma: 

Lemma 5.18 Any affine constraint $(A, \sigma)$ is equivalent to one whose number of variables is 
not more than the number of constraints. 

Proof: Assume that $A = (a_1, \ldots, a_m)$ takes $\ell$ variables for $\ell > m$, and consider the linear 
functions from $\mathbb{F}_p^\ell$ to $\mathbb{F}_p$ corresponding to $a_1, \ldots, a_m$. By a linear dimension argument there are 
$\ell - m$ linearly independent vectors $u_1, \ldots, u_{\ell-m} \in \mathbb{F}_p^\ell$ for which $a_i(u_j) = 0$ for all $i \in [m]$ and 
$j \in [\ell - m]$. Complete these vectors to a basis $u_1, \ldots, u_\ell$ of $\mathbb{F}_p^\ell$, making sure that $u_\ell$ equals the 
vector that is 1 on its first coordinate and zero everywhere else (this vector is not in the span 
of $u_1, \ldots, u_{\ell-m} \in \mathbb{F}_p^\ell$, because by the definition of an affine constraint $a_1$ sends it to 1). 

Now perform on the members of $A$ the change of view corresponding to the change to this basis 
of $\mathbb{F}_p^\ell$. Denoting the resulting linear forms by $A' = (a'_1, \ldots, a'_m)$, we note now that no $a'_i$ has any 
mention of the variables $X_1, \ldots, X_{\ell-m}$, and so the constraint $(A', \sigma)$ in fact takes at most $m$ 
variables. $A'$ will also have the standard form of an affine constraint with $X_\ell$ taking the place 
of $X_1$. ■ 

We need the above because the test would eventually query a number of places that is a function 
of p and the maximum number of variables in a subset of the constraints of A, where this subset 
is only guaranteed a bound on the number of linear forms per constraint; we thus need A to 
satisfy the following definition: 

Definition 5.19 (Concise collections) The collection $\mathcal{A} = \{(A^1,\sigma^1),(A^2,\sigma^2), \ldots\}$ is called 
concise if for every $A^i$, the total number of its variables does not exceed the number of its linear 
forms. 

Lemma 5.18 implies that every collection of linear constraints is equivalent to a concise one. 

We would also need to know the (lack of) effect that a change of view has on the $d$-dimension, 
and hence the $(d_1, \ldots, d_C)$-dimension, of $A$. 

Lemma 5.20 If $A = (a_1, \ldots, a_m)$ is a sequence of linear forms, and $A' = (a'_1, \ldots, a'_m)$ is the 
sequence of the resulting forms after a fixed change of view, then $A$ and $A'$ have the same 
$d$-dimension for any $d$. 

Proof: We use the identification of linear forms with linear functions from $\mathbb{F}_p^\ell$ to $\mathbb{F}_p$, and by 
extension for a linear form $a$ we consider the vector $a^{\otimes d}$ as the multilinear function $a^{\otimes d} : (\mathbb{F}_p^\ell)^d \to \mathbb{F}_p$ 
that sends $(v^{(1)}, \ldots, v^{(d)})$ to $\prod_{i=1}^{d} a(v^{(i)})$; the representation of this multilinear function in 
the standard basis indeed corresponds to the vector originally defined as $a^{\otimes d}$. 

The operation that takes $a$ to $a^{\otimes d}$ is not linear in itself; however, a change of basis over $\mathbb{F}_p^\ell$ 
(corresponding to the change of view) can be extended to an invertible linear operation over the 
linear space of all multilinear functions of $d$ vectors (not all of which come from linear forms). 
Namely, if $M$ is the basis change matrix, then the change of view for $a$ sends it to the function 
defined by $a'(v) = a(Mv)$, and $(a')^{\otimes d}$ in fact corresponds to $\prod_{i=1}^{d} a(Mv^{(i)})$. Now by basic linear 
algebra, the operation that sends any multilinear form $b : (\mathbb{F}_p^\ell)^d \to \mathbb{F}_p$ to the form $b'$ defined by 
$b'(v^{(1)}, \ldots, v^{(d)}) = b(Mv^{(1)}, \ldots, Mv^{(d)})$ is linear and invertible; thus the $d$-dimension, and in fact 
the exact corresponding linear dependencies, do not change when moving from $A = (a_1, \ldots, a_m)$ 
to $A' = (a'_1, \ldots, a'_m)$. ■ 
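
The invariance can be observed numerically on a toy instance. In coordinates, $a'(v) = a(Mv)$ means that the coefficient vector of $a'$ is $M^{T}$ applied to that of $a$. The sketch below applies such a change of view and compares $d$-dimensions before and after; the function names, the forms and the matrix $M$ are hypothetical choices for illustration.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank over F_p of a matrix given as a list of rows (Gaussian elimination)."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)              # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(x - factor * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def tensor_power(coeffs, d, p):
    """d-th tensor power of a linear form given by its coefficient tuple."""
    vec = []
    for idx in product(range(len(coeffs)), repeat=d):
        value = 1
        for i in idx:
            value = (value * coeffs[i]) % p
        vec.append(value)
    return vec

def d_dimension(forms, d, p):
    return rank_mod_p([tensor_power(a, d, p) for a in forms], p)

def change_of_view(coeffs, M, p):
    """Coefficients of a'(v) = a(M v): alpha'_i = sum_r alpha_r * M[r][i]."""
    ell = len(coeffs)
    return tuple(sum(coeffs[r] * M[r][i] for r in range(ell)) % p for i in range(ell))

p = 5
forms = [(1, 0), (1, 1), (1, 2)]        # X_1, X_1 + X_2, X_1 + 2 X_2
M = [[1, 2], [3, 4]]                    # determinant -2 = 3 (mod 5), hence invertible
viewed = [change_of_view(a, M, p) for a in forms]
```

A general invertible $M$ need not keep the forms in the standard shape of an affine constraint, but, as the lemma states, the $d$-dimensions of `forms` and `viewed` coincide for every $d$.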



We end this section with a lemma about a "juxtaposition" of two sets of identical forms while 
sharing one variable. 



Lemma 5.21 Suppose that $(a'_1, \ldots, a'_m)$ are linear forms over $(X_1, \ldots, X_\ell)$ of $d$-dimension $q$, 
where for some $k$ the form $a'_k$ sends $(X_1, \ldots, X_\ell)$ to $X_1$. The $d$-dimension of the following $2m$ 
linear forms over $(Z, X_2, \ldots, X_\ell, Y_2, \ldots, Y_\ell)$: 

$$\big(a'_1(Z, X_2, \ldots, X_\ell), \ldots, a'_m(Z, X_2, \ldots, X_\ell),\ a'_1(Z, Y_2, \ldots, Y_\ell), \ldots, a'_m(Z, Y_2, \ldots, Y_\ell)\big)$$

is exactly $2q - 1$. 



Proof: We note that $a'_k(Z, X_2, \ldots, X_\ell) = a'_k(Z, Y_2, \ldots, Y_\ell) = Z$, and that all other linear 
forms are distinct. Abusing notation somewhat, we let $Z$ denote also the linear form that 
returns the value of $Z$ from the variables $(Z, X_2, \ldots, X_\ell, Y_2, \ldots, Y_\ell)$; note that in particular $Z^{\otimes d}$ 
corresponds to the vector from $\mathbb{F}_p^{(2\ell-1)^d}$ that is 1 on its coordinate corresponding to $(1, \ldots, 1)$, 
and zero everywhere else. 

Let $S \subseteq \{1, \ldots, m\} \setminus \{k\}$ be a set of size $q - 1$ such that $\{(a'_j(Z, X_2, \ldots, X_\ell))^{\otimes d} : j \in S \cup \{k\}\}$ 
is a basis of size $q$ for the linear space $\mathrm{span}\{(a'_j(Z, X_2, \ldots, X_\ell))^{\otimes d} : j \in [m]\}$. Clearly, 
$\{(a'_j(Z, Y_2, \ldots, Y_\ell))^{\otimes d} : j \in S \cup \{k\}\}$ is a basis for $\mathrm{span}\{(a'_j(Z, Y_2, \ldots, Y_\ell))^{\otimes d} : j \in [m]\}$. 
Thus, the $d$-dimension of the $2m$ linear forms is at most $2q - 1$. To conclude, we will show that the 
$d$-dimension is at least $2q - 1$. To this end, we analyze the intersection 

$$\mathrm{span}\{(a'_j(Z, X_2, \ldots, X_\ell))^{\otimes d} : j \in S \cup \{k\}\} \cap \mathrm{span}\{(a'_j(Z, Y_2, \ldots, Y_\ell))^{\otimes d} : j \in S \cup \{k\}\}.$$

It is clearly contained in $\mathrm{span}\{Z^{\otimes d}\}$, since no other coordinate can be non-zero in both sets 
(the left set can have only non-zero coordinates corresponding to sequences of length $d$ over 
$\{1, \ldots, \ell\}$, and the right set can have only non-zero coordinates corresponding to sequences of 
length $d$ over $\{1, \ell + 1, \ldots, 2\ell - 1\}$). On the other hand, the intersection contains (and hence is 
equal to) $\mathrm{span}\{Z^{\otimes d}\}$, because this vector appears on both sides (as $a'_k$). This shows by a linear 
dimension argument that the $d$-dimension of the $2m$ linear forms is exactly $2q - 1$ as claimed. ■ 
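
As a numerical sanity check of the $2q - 1$ count (an illustration only, with hypothetical names), the sketch below builds the juxtaposed forms over the variables $(Z, X_2, \ldots, X_\ell, Y_2, \ldots, Y_\ell)$ and compares $d$-dimensions for the forms $X_1$, $X_1 + X_2$, $X_1 + 2X_2$ over $\mathbb{F}_5$, where the first form plays the role of $a'_k$.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank over F_p of a matrix given as a list of rows (Gaussian elimination)."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)              # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(x - factor * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def tensor_power(coeffs, d, p):
    """d-th tensor power of a linear form given by its coefficient tuple."""
    vec = []
    for idx in product(range(len(coeffs)), repeat=d):
        value = 1
        for i in idx:
            value = (value * coeffs[i]) % p
        vec.append(value)
    return vec

def d_dimension(forms, d, p):
    return rank_mod_p([tensor_power(a, d, p) for a in forms], p)

def juxtapose(forms):
    """Two copies of the forms sharing only the first variable Z."""
    ell = len(forms[0])
    zeros = (0,) * (ell - 1)
    left = [(a[0],) + tuple(a[1:]) + zeros for a in forms]    # over (Z, X_2.., 0..)
    right = [(a[0],) + zeros + tuple(a[1:]) for a in forms]   # over (Z, 0.., Y_2..)
    return left + right

p = 5
forms = [(1, 0), (1, 1), (1, 2)]     # the first form is X_1 itself
joined = juxtapose(forms)
```

For $d = 1$ the original forms have dimension $q = 2$ and the six juxtaposed forms have dimension 3; for $d = 2$ the values are $q = 3$ and 5.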



5.4 The Proof of Testability 

We finally have all the building blocks in place to prove Theorem 1.8, which implies Theorem 
1.7. 
Proof of Theorem 1.8: We begin with some preliminaries. Let $d$ be the maximum complexity 
of an affine constraint appearing in $\mathcal{A}$. By hypothesis, $d < p$. For $i \in [R]$, define $f^{(i)} : \mathbb{F}_p^n \to 
\{0,1\}$ so that $f^{(i)}(x)$ equals 1 when $f(x) = i$ and equals 0 otherwise. Additionally, set the 
following parameters, where $\Psi_{\mathcal{A}}$ is the compactness function of $\mathcal{A}$. 

p(C) = r3.7(d,«(C)) 
= 16 

^ 8R 

$\ell_{\mathcal{A}}$ and $\delta_{\mathcal{A}}$ will be defined, based on the above functions, in (4) and (13) below. 

Next, apply Theorem 4.12 to the functions $f^{(1)}, f^{(2)}, \ldots, f^{(R)}$ in order to get polynomial factors 
$\mathcal{B}$ and $\mathcal{B}'$, with $\mathcal{B}'$ a syntactic refinement of $\mathcal{B}$, of degree $d$ and size at most $C_{4.12}(\Delta,\eta,p,\rho,\zeta,R)$, 
an element $s \in \mathbb{F}_p^{|\mathcal{B}'|-|\mathcal{B}|}$, and functions $f_1^{(i)}, f_2^{(i)}, f_3^{(i)} : \mathbb{F}_p^n \to \mathbb{R}$ for every $i \in [R]$. The sequence of 
polynomials generating $\mathcal{B}'$ will be denoted by $P_1, \ldots, P_{|\mathcal{B}'|}$. Since $\mathcal{B}'$ is a syntactic refinement, $\mathcal{B}$ is 
generated by the polynomials $P_1, \ldots, P_{|\mathcal{B}|}$. 

Let $F$ be the $\zeta$-cleanup of $f$ with respect to $\mathcal{B}$, $\mathcal{B}'$ and $s$. By Lemma 5.15, and what we know 
of these partitions and $s$, $F$ is $\varepsilon/2$-close to $f$, and hence by our assumption on the farness of $f$, 
the function $F$ will still include an induced constraint from $\mathcal{A}$. 

By Observation 5.11, the big picture function $F_{\mathcal{B}}$ of $F$ will $(d_1, \ldots, d_{|\mathcal{B}|})$-partially induce some 
constraint from $\mathcal{A}$, and hence by Observation 5.13 it will partially induce some $(A^i, \sigma^i)$ for which 
$m_i \le \Psi_{\mathcal{A}}(|\mathcal{B}|)$. This will be the constraint of which we will find many copies in the original $f$. 
Let $m = m_i$, let $\ell = \ell_i$, and let $a_1, \ldots, a_m$ denote $a^i_1, \ldots, a^i_m$ respectively. Since a concise $\mathcal{A}$ 
means that $\ell_i \le m_i$, we can now define 

$$\ell_{\mathcal{A}}(\varepsilon) = \Psi_{\mathcal{A}}\big(C_{4.12}(\Delta,\eta,p,\rho,\zeta,R)\big). \qquad (4)$$

Denote the linear forms in $A^i$ by $a_1, \ldots, a_m$ and denote $\sigma^i = (\sigma_1, \ldots, \sigma_m)$. Let 
$c_1 = (c_{1,1}, \ldots, c_{|\mathcal{B}|,1}), \ldots, c_m = (c_{1,m}, \ldots, c_{|\mathcal{B}|,m}) \in \mathbb{F}_p^{|\mathcal{B}|}$ index the cells of $\mathcal{B}$ where $(A^i, \sigma^i)$ is 
partially induced by $F_{\mathcal{B}}$, the big picture function of the cleanup function $F$, i.e., $c_1, \ldots, c_m$ are 
consistent, and $\sigma_j \in F_{\mathcal{B}}(c_j)$ for every $j \in [m]$. Also, let $c'_1, \ldots, c'_m \in \mathbb{F}_p^{|\mathcal{B}'|}$ index the associated 
subcells of $\mathcal{B}'$, obtained by letting $c'_j = (c_j, s)$ for every $j \in [m]$. 

Our goal will now be to lower bound: 

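$$\Pr_{x_1,\ldots,x_\ell \in \mathbb{F}_p^n}\left[f(a_1(x_1,\ldots,x_\ell)) = \sigma_1 \wedge \cdots \wedge f(a_m(x_1,\ldots,x_\ell)) = \sigma_m\right] = \mathbb{E}_{x_1,\ldots,x_\ell \in \mathbb{F}_p^n}\left[f^{(\sigma_1)}(a_1(x_1,\ldots,x_\ell)) \cdots f^{(\sigma_m)}(a_m(x_1,\ldots,x_\ell))\right] \tag{5}$$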
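The identity in (5), that the probability of seeing the pattern equals the expectation of a product of indicator functions, can be checked by brute force on a toy instance. The following is only a sketch under assumed values: the prime, the dimension, the range size, the three forms and the target pattern are all illustrative choices, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product
import random

def apply_form(coeffs, xs, p):
    """Evaluate sum_i coeffs[i] * xs[i] coordinate-wise over F_p."""
    n = len(xs[0])
    return tuple(sum(c * x[k] for c, x in zip(coeffs, xs)) % p for k in range(n))

def pattern_probability(f, forms, sigma, p, n):
    """Pr over x_1..x_l in F_p^n that f(a_j(x)) == sigma_j for every j."""
    l = len(forms[0])
    points = list(product(range(p), repeat=n))
    hits = sum(
        all(f[apply_form(a, xs, p)] == s for a, s in zip(forms, sigma))
        for xs in product(points, repeat=l)
    )
    return Fraction(hits, len(points) ** l)

def indicator_expectation(f, forms, sigma, p, n):
    """E over x_1..x_l of the product of the indicators f^(sigma_j)(a_j(x))."""
    l = len(forms[0])
    points = list(product(range(p), repeat=n))
    total = 0
    for xs in product(points, repeat=l):
        term = 1
        for a, s in zip(forms, sigma):
            # f^(s)(y) is 1 when f(y) == s and 0 otherwise.
            term *= 1 if f[apply_form(a, xs, p)] == s else 0
        total += term
    return Fraction(total, len(points) ** l)

# Toy instance; every value here is an illustrative assumption.
p, n, R = 3, 2, 2
rng = random.Random(0)
f = {x: rng.randint(1, R) for x in product(range(p), repeat=n)}
forms = [(1, 0), (1, 1), (1, 2)]  # x1, x1 + x2, x1 + 2*x2
sigma = (1, 2, 1)
lhs = pattern_probability(f, forms, sigma, p, n)
rhs = indicator_expectation(f, forms, sigma, p, n)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to floating-point error.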



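The theorem obviously follows if the expectation in (5) is more than the respective $\delta_{\mathcal{A}}(\epsilon)$. We 
rewrite the expectation as: 

$$\mathbb{E}_{x_1,\ldots,x_\ell \in \mathbb{F}_p^n}\left[\left(f_1^{(\sigma_1)} + f_2^{(\sigma_1)} + f_3^{(\sigma_1)}\right)(a_1(x_1,\ldots,x_\ell)) \cdots \left(f_1^{(\sigma_m)} + f_2^{(\sigma_m)} + f_3^{(\sigma_m)}\right)(a_m(x_1,\ldots,x_\ell))\right] \tag{6}$$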
We can expand the expression inside the expectation as a sum of $3^m$ terms. The expectation 
of any term which is a multiple of $f_2^{(\sigma_j)}$ for any $j \in [m]$ has an absolute value upper bound of 
$\|f_2^{(\sigma_j)}\|_{U^{d+1}} \le \eta(|\mathcal{B}'|)$, because of Lemma 3.3 and the fact that the complexity of $A^i$ is bounded 
by $d$. Hence, the expression (6) is at least: 

$$\mathbb{E}_{x_1,\ldots,x_\ell}\left[\left(f_1^{(\sigma_1)} + f_3^{(\sigma_1)}\right)(a_1(x_1,\ldots,x_\ell)) \cdots \left(f_1^{(\sigma_m)} + f_3^{(\sigma_m)}\right)(a_m(x_1,\ldots,x_\ell))\right] - 3^m \eta(|\mathcal{B}'|) \tag{7}$$



Before we continue, to ease notation, for the rest of the proof we will now define an indicator 
function. $I_{c'_1,\ldots,c'_m}(x_1, \ldots, x_\ell)$ will be set to 1 if $\mathcal{B}'(a_j(x_1, \ldots, x_\ell)) = c'_j$ for every $j \in [m]$, and 
it will be set to 0 otherwise. 

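Now, because of the non-negativity of $f_1^{(\sigma_j)} + f_3^{(\sigma_j)}$ for every $j \in [m]$, the expectation in (7) is 
at least: 

$$\mathbb{E}_{x_1,\ldots,x_\ell}\left[\left(f_1^{(\sigma_1)} + f_3^{(\sigma_1)}\right)(a_1(x_1,\ldots,x_\ell)) \cdots \left(f_1^{(\sigma_m)} + f_3^{(\sigma_m)}\right)(a_m(x_1,\ldots,x_\ell)) \cdot I_{c'_1,\ldots,c'_m}(x_1,\ldots,x_\ell)\right]$$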

In other words, what we are doing now is counting only patterns that arise from the selected 
subcells $c'_1, \ldots, c'_m$. We next expand the product inside the expectation into $2^m$ terms. The 
main contribution will come from: 

$$\mathbb{E}_{x_1,\ldots,x_\ell}\left[f_1^{(\sigma_1)}(a_1(x_1,\ldots,x_\ell)) \cdots f_1^{(\sigma_m)}(a_m(x_1,\ldots,x_\ell)) \cdot I_{c'_1,\ldots,c'_m}(x_1,\ldots,x_\ell)\right] \tag{8}$$



But first, let us show that the contribution from each of the other $2^m - 1$ terms is small. 
Consider a term that contains $f_3^{(\sigma_k)}$ for some $k \in [m]$. Letting $g$ denote an arbitrary function 
with $\|g\|_\infty \le 1$, such a term is of the form: 

$$\mathbb{E}_{x_1,\ldots,x_\ell}\left[f_3^{(\sigma_k)}(a_k(x_1,\ldots,x_\ell))\, g(x_1,\ldots,x_\ell) \cdot I_{c'_1,\ldots,c'_m}(x_1,\ldots,x_\ell)\right] \tag{9}$$



By our definition of affine constraints, $a_k(x_1, \ldots, x_\ell)$ is of the form $x_1 + \sum_{i=2}^{\ell} \alpha_i x_i$ for some 
$\alpha_i \in \mathbb{F}_p$. We now change the summation variables of the expectation by replacing $x_1$ with 
$z = x_1 + \sum_{i=2}^{\ell} \alpha_i x_i$, effecting a change of view for $a_1, \ldots, a_m$. Letting $a'_1, \ldots, a'_m$ denote the 
linear forms as they appear after the change, we first note that $a'_k(z, x_2, \ldots, x_\ell)$ will equal $z$. 
We can now bound the square of (9) using Cauchy-Schwarz as: 



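$$
\begin{aligned}
&\left(\mathbb{E}_{x_1,\ldots,x_\ell}\left[f_3^{(\sigma_k)}(a_k(x_1,\ldots,x_\ell))\, g(x_1,\ldots,x_\ell) \cdot I_{c'_1,\ldots,c'_m}(x_1,\ldots,x_\ell)\right]\right)^2 \\
&\quad \le \mathbb{E}_{z}\left[\left(f_3^{(\sigma_k)}(z)\right)^2 \mathbf{1}[\mathcal{B}'(z) = c'_k]\right] \cdot \mathbb{E}_{z}\left[\left(\mathbb{E}_{x_2,\ldots,x_\ell}\left[\mathbf{1}\left[\forall j \in [m]:\ \mathcal{B}'(a'_j(z,x_2,\ldots,x_\ell)) = c'_j\right]\right]\right)^2\right] \\
&\quad \le \Delta^2(|\mathcal{B}|) \cdot \Pr_{z}[\mathcal{B}'(z) = c'_k] \cdot \mathbb{E}_{z}\left[\left(\mathbb{E}_{x_2,\ldots,x_\ell}\left[\mathbf{1}\left[\forall j \in [m]:\ \mathcal{B}'(a'_j(z,x_2,\ldots,x_\ell)) = c'_j\right]\right]\right)^2\right] \\
&\quad \le \Delta^2(|\mathcal{B}|) \cdot \left(p^{-|\mathcal{B}'|} + \alpha(|\mathcal{B}'|)\right) \cdot \mathbb{E}_{z}\left[\left(\mathbb{E}_{x_2,\ldots,x_\ell}\left[\prod_{i \in [|\mathcal{B}'|],\, j \in [m]} \frac{1}{p} \sum_{\lambda_{i,j} \in \mathbb{F}_p} e\left(\lambda_{i,j} \cdot \left(P_i(a'_j(z,x_2,\ldots,x_\ell)) - c'_{i,j}\right)\right)\right]\right)^2\right] \\
&\quad \le \frac{2\Delta^2(|\mathcal{B}|)}{p^{2|\mathcal{B}'|m + |\mathcal{B}'|}} \sum_{\substack{\lambda_{i,j},\, \tau_{i,j} \in \mathbb{F}_p \\ i \in [|\mathcal{B}'|],\, j \in [m]}} \left|\, \mathbb{E}_{\substack{z, x_2, \ldots, x_\ell \\ y_2, \ldots, y_\ell}}\left[ e\left(\sum_{\substack{i \in [|\mathcal{B}'|] \\ j \in [m]}} \lambda_{i,j} P_i(a'_j(z, x_2, \ldots, x_\ell)) - \sum_{\substack{i \in [|\mathcal{B}'|] \\ j \in [m]}} \tau_{i,j} P_i(a'_j(z, y_2, \ldots, y_\ell))\right)\right] \right|
\end{aligned}
\tag{10}
$$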
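The chain leading to (10) expands the cell indicator using additive characters of $\mathbb{F}_p$, through the identity $\mathbf{1}[y = c] = \frac{1}{p}\sum_{\lambda \in \mathbb{F}_p} e(\lambda (y - c))$ applied to each coordinate. A minimal numerical sketch of that identity follows; the function names and the choice $p = 5$ are illustrative assumptions, and $e(t)$ is taken to be $\exp(2\pi i t / p)$.

```python
import cmath

def e_p(t, p):
    """The additive character e(t) = exp(2*pi*i*t/p) of F_p."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)

def coordinate_indicator(y, c, p):
    """(1/p) * sum over lam in F_p of e(lam * (y - c)).

    Numerically equals 1 when y == c (mod p) and 0 otherwise.
    """
    return sum(e_p(lam * (y - c), p) for lam in range(p)) / p

def cell_indicator(values, cell, p):
    """Product of coordinate indicators: 1 exactly when values match cell."""
    result = 1
    for y, c in zip(values, cell):
        result *= coordinate_indicator(y, c, p)
    return result

# Largest deviation from the exact 0/1 indicator over all pairs (y, c).
p = 5
max_error = max(
    abs(coordinate_indicator(y, c, p) - (1 if y == c else 0))
    for y in range(p)
    for c in range(p)
)
```

Taking the product over all polynomials $P_i$ and all forms $a'_j$, as `cell_indicator` does over coordinates, gives the product-of-sums expression that is then squared and expanded into the sum over $\lambda_{i,j}$ and $\tau_{i,j}$.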



Now, by Lemma 5.20, the $(d_1, \ldots, d_{|\mathcal{B}'|})$-dimension of $\{a_1, \ldots, a_m\}$ equals the $(d_1, \ldots, d_{|\mathcal{B}'|})$-dimension 
of $\{a'_1, \ldots, a'_m\}$. 

Let $\delta$ denote the $(d_1, \ldots, d_{|\mathcal{B}'|})$-dimension of $\{a_1, \ldots, a_m\}$. By Lemma 5.21, summing over all of 
$(d_1, \ldots, d_{|\mathcal{B}'|})$, we know that the $(d_1, \ldots, d_{|\mathcal{B}'|})$-dimension of 

$$\left(a'_1(z, x_2, \ldots, x_\ell), \ldots, a'_m(z, x_2, \ldots, x_\ell), a'_1(z, y_2, \ldots, y_\ell), \ldots, a'_m(z, y_2, \ldots, y_\ell)\right)$$

is exactly $2\delta - |\mathcal{B}'|$. 

Now, just as in the proof of Theorem 5.7, the above information is enough to upper-bound (10). 
The above $(d_1, \ldots, d_{|\mathcal{B}'|})$-dimension bound and Lemma 5.4 allow us to count the number of $\lambda_{i,j}$ 
and $\tau_{i,j}$ such that the quantity inside the expectation in (10) is identically 1, and Lemma 5.1 
along with the high-rank condition on the polynomials $P_i$ bounds the expectation otherwise. It 
follows that (10), and therefore the square of (9), is at most: 

$$\frac{2\Delta^2(|\mathcal{B}|)}{p^{2m|\mathcal{B}'| + |\mathcal{B}'|}} \left(p^{2m|\mathcal{B}'| - (2\delta - |\mathcal{B}'|)} + p^{2m|\mathcal{B}'|} \alpha(|\mathcal{B}'|)\right) \le 2\Delta^2(|\mathcal{B}|) \cdot \left(p^{-2\delta} + \alpha(|\mathcal{B}'|)\right) \tag{11}$$
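The simplification in (11) is plain exponent arithmetic: the prefactor cancels against the count of trivial terms to give $p^{-2\delta}$, and the remaining term is $p^{-|\mathcal{B}'|}\alpha(|\mathcal{B}'|) \le \alpha(|\mathcal{B}'|)$. The sketch below checks this with exact rationals. It assumes the reading of (11) given above, and all names and sample values (`size_b_prime` for $|\mathcal{B}'|$, `dim` for $\delta$, `delta_sq` for $\Delta^2(|\mathcal{B}|)$, `alpha` for $\alpha(|\mathcal{B}'|)$) are hypothetical.

```python
from fractions import Fraction

def bound_before(p, m, size_b_prime, dim, delta_sq, alpha):
    """Left side of (11): prefactor times (trivial-term count + alpha-bounded rest)."""
    exponent_all = 2 * m * size_b_prime
    prefactor = 2 * delta_sq / Fraction(p) ** (exponent_all + size_b_prime)
    trivial = Fraction(p) ** (exponent_all - (2 * dim - size_b_prime))
    remainder = Fraction(p) ** exponent_all * alpha
    return prefactor * (trivial + remainder)

def bound_after(p, dim, delta_sq, alpha):
    """Right side of (11): 2 * Delta^2 * (p^(-2*dim) + alpha)."""
    return 2 * delta_sq * (Fraction(p) ** (-2 * dim) + alpha)

# Illustrative sample values only.
p, m, size_b_prime, dim = 3, 3, 2, 4
delta_sq = Fraction(1, 100)
alpha = Fraction(1, 3 ** 10)
left = bound_before(p, m, size_b_prime, dim, delta_sq, alpha)
right = bound_after(p, dim, delta_sq, alpha)
# Exact simplified form of the left side.
simplified = 2 * delta_sq * (Fraction(p) ** (-2 * dim) + alpha / Fraction(p) ** size_b_prime)
```

The slack between `left` and `right` is exactly the factor $p^{-|\mathcal{B}'|}$ on the $\alpha$ term, which the bound discards.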

Finally, we lower-bound the contribution from the main term (8). To begin with, we need to 
convince ourselves that $f$ induces many copies of $(A^i, \sigma^i)$ among the subcells $c'_1, \ldots, c'_m$. Recall 
that $c_1, \ldots, c_m$ are consistent with $d_1, \ldots, d_{|\mathcal{B}|}$ and $A^i$, and that $\sigma_j \in F_{\mathcal{B}}(c_j)$ for every $j \in [m]$. 
By Lemma 5.8, $c'_1, \ldots, c'_m$ are consistent with $d_1, \ldots, d_{|\mathcal{B}'|}$ and $A^i$ as well. 

We can now lower-bound (8) as follows: 



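$$
\begin{aligned}
&\mathbb{E}_{x_1,\ldots,x_\ell}\left[f_1^{(\sigma_1)}(a_1(x_1,\ldots,x_\ell)) \cdots f_1^{(\sigma_m)}(a_m(x_1,\ldots,x_\ell)) \cdot I_{c'_1,\ldots,c'_m}(x_1,\ldots,x_\ell)\right] \\
&\quad = \Pr\left[\mathcal{B}'(a_1(x_1,\ldots,x_\ell)) = c'_1 \wedge \cdots \wedge \mathcal{B}'(a_m(x_1,\ldots,x_\ell)) = c'_m\right] \cdot \\
&\qquad\ \mathbb{E}_{x_1,\ldots,x_\ell}\left[f_1^{(\sigma_1)}(a_1(x_1,\ldots,x_\ell)) \cdots f_1^{(\sigma_m)}(a_m(x_1,\ldots,x_\ell)) \,\middle|\, \forall j \in [m]:\ \mathcal{B}'(a_j(x_1,\ldots,x_\ell)) = c'_j\right] \\
&\quad \ge \left(p^{-\delta} - \alpha(|\mathcal{B}'|)\right) \cdot \left(\frac{\epsilon}{8R}\right)^m
\end{aligned}
\tag{12}
$$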

Let us justify the last line. The first term is due to Lemma 5.8 and the lower bound on the 
probability from Theorem 5.7. The second term in (12) is because each $f_1^{(\sigma_j)}$ is constant on the 
cells of $\mathcal{B}'$, and because by construction, the big picture function $F_{\mathcal{B}}$ of the cleanup function $F$, 
on which $(A^i, \sigma^i)$ was partially induced, supports a value inside a cell $c$ of $\mathcal{B}$ only if the original 
function $f$ acquires the value on at least an $\epsilon/(8R)$ fraction of the subcell $(c, s)$. 

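Combining the bounds from (7), (11) and (12), and using our parameter settings, we get that 
(5) is at least: 

$$\left(p^{-\delta} - \alpha(|\mathcal{B}'|)\right) \cdot \left(\frac{\epsilon}{8R}\right)^m - \sqrt{2\Delta^2(|\mathcal{B}|) \cdot \left(p^{-2\delta} + \alpha(|\mathcal{B}'|)\right)} - 3^m \cdot \eta(|\mathcal{B}'|) \ge \frac{p^{-\delta}}{4} \left(\frac{\epsilon}{8R}\right)^m$$

where both $|\mathcal{B}|$ and $|\mathcal{B}'|$ are upper-bounded by $C_{4.12}(\Delta, \eta, \rho, p, \zeta, R)$. We can now define 

$$\delta_{\mathcal{A}}(\epsilon) = \frac{1}{4}\, p^{-\Psi_{\mathcal{A}}(C_{4.12}(\Delta, \eta, \rho, p, \zeta, R)) \cdot C_{4.12}(\Delta, \eta, \rho, p, \zeta, R)} \cdot \left(\frac{\epsilon}{8R}\right)^{\Psi_{\mathcal{A}}(C_{4.12}(\Delta, \eta, \rho, p, \zeta, R))} \tag{13}$$

to conclude the proof. ■ 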